International Journal of Engineering and Technical Research (IJETR) 
ISSN: 2321-0869 (O) 2454-4698 (P), Volume-5, Issue-4, August 2016 


A Review: Enhancement of Brain Computer Interface 

Simerjit Kaur, Jashanpreet Kaur 


Abstract — An effective brain computer interface (BCI) 
leverages the separate strengths of human and machine to 
create new capabilities and gains in efficiency. With B-Alert 
BCI development tools, developers are provided with rapid 
prototyping tools to fit the right approach to the right task. 
Within clinical environments, the results are recovery of lost 
function and accelerated healing. In other applications, BCIs 
facilitate more efficient interactions between man and machine. 
This work focuses on P300 (a type of EEG signal) signal processing, 
feature extraction from the processed signals, discovery of signal 
classes, and classification and interpretation of unknown signals. 

Index Terms — BCI, P300, EEG Signal, SOM 

I. INTRODUCTION 

This work focuses on P300 (a type of EEG signal) signal 
processing, feature extraction from the processed signals, 
discovery of signal classes, and classification and interpretation of 
unknown signals. The research methodology involves the 
following steps: 

• EEG Data Sets (Already Collected) 

• Signal Preprocessing 

• Feature Extraction 

• Knowledge Discovery using SOM 

• Classification using Classifier Ensemble 

• Comparing accuracy with previously reported work 

a. Signal acquisition 

In the BCIs discussed here, the input is EEG recorded from 
the scalp or the surface of the brain or neuronal activity 
recorded within the brain. Electrophysiological BCIs can be 
categorized by whether they use non-invasive (e.g. EEG) or 
invasive (e.g. intracortical) methodology. They can also be 
categorized by whether they use evoked or spontaneous 
inputs. Evoked inputs (e.g. EEG produced by flashing letters) 
result from stereotyped sensory stimulation provided by the 
BCI. Spontaneous inputs (e.g. EEG rhythms over sensorimotor 
cortex) do not depend for their generation on such 
stimulation. There is, presumably, no reason why a BCI could 
not combine non-invasive and invasive methods or evoked 
and spontaneous inputs. In the signal-acquisition part of BCI 
operation, the chosen input is acquired by the recording 
electrodes, amplified, and digitized [15]. 

b. Signal processing 

The goal of signal analysis in a BCI system is to maximize the 
signal-to-noise ratio (SNR) of the EEG or single-unit features 
that carry the user’s messages and commands. To achieve this 
goal, consideration of the major sources of noise is essential. 
Noise has both non-neural sources (e.g., eye movements, 
EMG, 60-Hz line noise) and neural sources (e.g., EEG 
features other than those used for communication). Noise 
detection and discrimination problems are greatest when the 
characteristics of the noise are similar in frequency, time or 
amplitude to those of the desired signal. For example, eye 
movements are of greater concern than EMG when a slow 
cortical potential is the BCI input feature because eye 
movements and slow potentials have overlapping frequency 
ranges. 

Numerous options are available for BCI signal processing. 
Ultimately, they need to be compared in on-line experiments 
that measure speed and accuracy. The new Graz BCI system, 
based on Matlab and Simulink, supports rapid prototyping of 
various methods. Different spatial filters and spectral analysis 
methods can be implemented in Matlab and compared in 
regard to their online performance. Autoregressive (AR) 
model parameter estimation is a useful method for describing 
EEG activity. 

Signal processing methods are important in BCI design, but 
they cannot solve every problem. While they can enhance the 
signal-to-noise ratio, they cannot directly address the impact 
of changes in the signal itself. Factors such as motivation, 
intention, frustration, fatigue, and learning affect the input 
features that the user provides. Thus, BCI development 
depends on appropriate management of the adaptive 
interactions between system and user, as well as on selection 
of appropriate signal processing methods [14]. 

c. Feature extraction 

The digitized signals are then subjected to one or more of a 
variety of feature extraction procedures, such as spatial 
filtering, voltage amplitude measurements, spectral analyses, 
or single-neuron separation. This analysis extracts the signal 
features that (hopefully) encode the user’s messages or 
commands. BCIs can use signal features that are in the time 
domain (e.g. evoked potential amplitudes or neuronal firing 
rates) or the frequency domain. A BCI could conceivably use 
both time domain and frequency-domain signal features, and 
might thereby improve performance [14]. 

d. The translation algorithm 

The first part of signal processing simply extracts specific 
signal features. The next stage, the translation algorithm, 
translates these signal features into device commands 
that carry out the user’s intent. This algorithm might use linear 
methods (e.g. classical statistical analyses (Jain et al., 2000)) 
or nonlinear methods (e.g. neural networks). Whatever its 
nature, each algorithm changes independent variables (i.e. 
signal features) into dependent variables (i.e. device control 
commands). 

A translation algorithm is a series of computations that 
transforms the BCI input features derived by the signal 
processing stage into actual device control commands. Stated 
in a different way, a translation algorithm takes abstract 
feature vectors that reflect specific aspects of the current state 
of the user’s EEG or single-unit activity (i.e., aspects that 
encode the message that the user wants to communicate) and 
transforms those vectors into application-dependent device 
commands. Different BCIs use different translation 
algorithms (e.g., [3]-[9]). Each algorithm can be classified in 
terms of three key features: transfer function, adaptive 
capacity, and output. The transfer function can be linear (e.g., 
linear discriminant analysis, linear equations) or nonlinear 
(e.g., neural networks). The algorithm can be adaptive or 
non-adaptive. Adaptive algorithms can use simple handcrafted 
rules or more sophisticated machine-learning algorithms. The 
output of the algorithm may be discrete (e.g., letter selection) 
or continuous [4]. 
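
As an illustration only, a non-adaptive translation algorithm with a linear transfer function and a discrete output could be sketched as follows. The function name, weights, threshold and command labels are hypothetical and are not taken from any cited BCI system.

```python
from typing import Sequence


def translate(features: Sequence[float], weights: Sequence[float],
              bias: float = 0.0, threshold: float = 0.0) -> str:
    """Map a feature vector to a discrete command with a linear transfer function."""
    # Linear transfer function: weighted sum of the signal features plus a bias
    score = sum(w * x for w, x in zip(weights, features)) + bias
    # Discrete output: the score is thresholded into one of two commands
    return "select" if score > threshold else "ignore"


# Hypothetical two-feature example (e.g. two evoked-potential amplitudes)
print(translate([1.2, 0.4], weights=[0.8, -0.5]))
```

An adaptive variant would update `weights` and `bias` as more labeled trials from the user become available.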

e. The output device 

For most current BCIs, the output device is a computer 
screen and the output is the selection of targets, letters, or 
icons presented on it. Some BCIs also provide additional, 
interim output, such as cursor movement toward the item prior 
to its selection. In addition to being the intended product of 
BCI operation, this output is the feedback that the brain uses 
to maintain and improve the accuracy and speed of 
communication. Initial studies are also exploring BCI control 
of a neuroprosthesis or orthosis that provides hand closure to 
people with cervical spinal cord injuries. In this prospective BCI 
application, the output device is the user’s own hand. 

f. The operating protocol 

Each BCI has a protocol that guides its operation. This 
protocol defines how the system is turned on and off, whether 
communication is continuous or discontinuous, whether 
message transmission is triggered by the system (e.g. by the 
stimulus that evokes a P300) or by the user, the sequence and 
speed of interactions between user and system, and what 
feedback is provided to the user. Most 
protocols used in BCI research are not completely suitable for 
BCI applications that serve the needs of people with 
disabilities. Most laboratory BCIs do not give the user on/off 
control: the investigator turns the system on and off. Because 
they need to measure communication speed and accuracy, 
laboratory BCIs usually tell their users what messages or 
commands to send. In real life the user picks the message. 
Such differences in protocol can complicate the transition 
from research to application. 

A standard P300 signal dataset that has already been 
collected will be used; the P300 datasets come from the BCI 
competitions. These signals will be pre-processed, which 
includes amplification and filtering, and then digitized for 
further feature extraction and classification. The P300 signals 
are non-stationary and self-generated signals. 

For better interpretation of the EEG signal in the time domain 
and frequency domain simultaneously, the wavelet transform 
(WT) and wavelet packet transform (WPT) are good 
analysis tools. Extensive research has also been reported 
on feature extraction in P300-based BCI systems using 
wavelet theory or wavelet packet decomposition. Knowledge 
discovery is the process of discovering new patterns in 
large data sets. Here a self-organizing feature map (SOM) will be used 
to discover classes from the signals. The pre-processed wavelet 
vectors form ‘clusters’ on the trained SOM that are related to 
P300 patterns. Every detected class is depicted as a cluster on 
the map. For the classification of unknown data samples, 
a variety of classifiers exists, such as artificial neural networks, 
back-propagation neural networks, hidden Markov models 
(HMM) and Bayes networks. 

Recent work on ensemble learning combines several classifiers. 
Combining classifiers reduces variance, which matters because 
unstable classifiers can have low bias but high variance. 
Commonly used ensemble learning methods are bagging, 
boosting, stacking and voting. Therefore, a 
classifier ensemble (a recent trend in classifier combination) 
will be used to obtain a better classification. 
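
To make the voting idea concrete, the following minimal sketch combines base classifiers by majority vote. The three threshold classifiers and the class labels are hypothetical stand-ins, not the classifiers used in this work.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of base classifiers."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label and combine the answers by voting."""
    return majority_vote([classify(sample) for classify in classifiers])


# Hypothetical base classifiers: each thresholds a single amplitude feature
classifiers = [
    lambda amplitude: "P300" if amplitude > 2.0 else "non-P300",
    lambda amplitude: "P300" if amplitude > 3.0 else "non-P300",
    lambda amplitude: "P300" if amplitude > 4.0 else "non-P300",
]

print(ensemble_predict(classifiers, 3.5))  # two of the three classifiers vote "P300"
```

An odd number of binary classifiers is used here so that a vote can never end in a tie.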

II. LITERATURE REVIEW 

Anupama H.S., N.K. Cauvery and Lingaraju G.M. (2012) 
stated that a brain computer interface (BCI) provides a 
communication path between human brain and the computer 
system. With the advancement in the areas of information 
technology and neurosciences, there has been a surge of 
interest in turning fiction into reality. The major goal of BCI 
research is to develop a system that allows disabled people to 
communicate with other persons and helps to interact with the 
external environments. This area includes components like, 
comparison of invasive and noninvasive technologies to 
measure brain activity, evaluation of control signals (i.e. 
patterns of brain activity that can be used for communication), 
development of algorithms for translation of brain signals into 
computer commands, and the development of new BCI 
applications. Their paper provides an insight into the aspects 
of BCI, its applications, recent developments and open 
problems in this area of research. 

Jonathan R. Wolpaw, Niels Birbaumer, Dennis J. 
McFarland (2002) observed that for many years people 
have speculated that electroencephalographic activity or other 
electrophysiological measures of brain function might 
provide a new non-muscular channel for sending messages 
and commands to the external world - a brain-computer 
interface (BCI). Over the past 15 years, productive BCI 
research programs have arisen. Encouraged by new 
understanding of brain function, by the advent of powerful 
low-cost computer equipment, and by growing recognition of 
the needs and potentials of people with disabilities, these 
programs concentrate on developing new augmentative 
communication and control technology for those with severe 
neuromuscular disorders, such as amyotrophic lateral 
sclerosis, brainstem stroke, and spinal cord injury. The 
immediate goal is to provide these users, who may be 
completely paralyzed, or ‘locked in’, with basic 
communication capabilities so that they can express their 
wishes to caregivers or even operate word processing 
programs or neuroprostheses. 

Brent J. Lance and Kaleb McDowell (2012) observed that as 
the proliferation of technology dramatically infiltrates all 
aspects of modern life, in many ways the world is becoming so 
dynamic and complex that technological capabilities are 
overwhelming human capabilities to optimally interact with 
and leverage those technologies. Fortunately, these 
technological advancements have also driven an explosion of 
neuroscience research over the past several decades, 
presenting engineers with a remarkable opportunity to design 
and develop flexible and adaptive brain-based 
neurotechnologies that integrate with and capitalize on human 
capabilities and limitations to improve human-system 
interactions. Major forerunners of this conception are 
brain-computer interfaces (BCIs), which to this point have 
been largely focused on improving the quality of life for 
particular clinical populations and include, for example, 
applications for advanced communications with paralyzed or 
“locked-in” patients as well as the direct control of prostheses 
and wheelchairs. 

Luis Fernando Nicolas-Alonso and Jaime Gomez-Gil (2012) 
described a brain-computer interface (BCI) as a hardware 
and software communications system that permits cerebral 
activity alone to control computers or external devices. The 
immediate goal of BCI research is to provide communications 
capabilities to severely disabled people who are totally 
paralyzed or ‘locked in’ by neurological or neuromuscular 
disorders, such as amyotrophic lateral sclerosis, brain stem 
stroke, or spinal cord injury. Their paper reviews the 
state of the art of BCIs, looking at the different steps that 
form a standard BCI: signal acquisition, preprocessing or 
signal enhancement, feature extraction, classification and the 
control interface. 

III. OBJECTIVE 

• Investigate the event-related potential (ERP) response 
for the P300-based brain-computer interface speller. 

• Apply a signal preprocessing method that integrates coherent 
averaging, principal component analysis (PCA) and 
independent component analysis (ICA) to reduce the 
dimensions and noise in the raw data. 

• The time-frequency analysis will be based on wavelets. 

IV. METHODOLOGY 

A research methodology tells us whether others 
have used techniques or methods similar to the ones we are 
proposing, which techniques suited them best, and 
what drawbacks they encountered. Hence, we 
will be in a better position to select a methodology that is 
capable of providing a valid answer to all the research 
questions. At each 
step of our work we are provided with multiple choices, 
either to follow one scenario or to use another, and the choice 
helps us to define and achieve the objective. Thus the knowledge 
base of research methodology plays an important role. 

RESEARCH PLAN 

The whole program is divided into 3 phases: 

PHASE 1 

• load the training dataset 

• select the specific channel in which P300 signals are 
present 

• low-pass and high-pass Butterworth filtering 

• coherent averaging 

• ICA 

• PCA 

• wavelet filtering to extract the features to be trained using 
db4 wavelet 

• k-means clustering of the obtained features 

• SVM training of the clusters obtained after k-means 

PHASE 2 

• load the testing dataset 

• select the specific channel in which P300 signals are 
present 

• low-pass and high-pass Butterworth filtering 

• coherent averaging 

• ICA 

• PCA 

• wavelet filtering to extract the features to be trained using 
db4 wavelet 

• k-means clustering of the obtained features 

PHASE 3 

Classify using trained clusters from Phase 1 and features from 
Phase 2 

V. FUTURE SCOPE 

The discussed study shows that the question about the 
mechanisms generating the ERP in the human EEG is still far 
from being answered. It is noteworthy that several studies 
yielding evidence for phase resetting argue that phase reset 
may be only one mechanism which is involved in ERP 
generation, but they also provide evidence for an evoked 
response. The crucial point is to quantify the contribution of 
each mechanism. 

A Compact Detector Set for Artificial Immune Systems 

Nguyen Van Truong 


Abstract — Negative selection algorithms (NSAs) are methods 
inspired by the T cell maturation process. They all comprise 
two phases: generation of detectors that match none of the self 
samples, and classification of monitored elements as self or 
nonself using these detectors. However, the detector sets 
generated may be redundant. In this paper, we propose a new 
negative selection algorithm that generates a complete and 
non-redundant detector set using an extension of the r-chunk 
matching rule. This reduces detector storage and 
classification time. Experimental results on four datasets show 
the effectiveness of the proposed algorithm. 

Index Terms — Immune system, negative selection, r-chunk, 
detectors generation. 



I. Introduction 

In the field of Artificial Immune Systems (AIS), the negative 
selection algorithm is a class of techniques inspired by the T 
cell maturation process that happens in the thymus. The 
mechanism by which T cells discriminate between self (signal of a 
healthy cell) and nonself (signal of an unhealthy cell) is 
modeled by NSAs. T cells are first generated randomly and in 
a large number, in the hope that every pathogen that might 
infect the host could be detected by at least some of these 
cells. However, the host must ensure that no cell generated 
would turn against itself (autoimmune reactions). Hence, 
newborn T cells must undergo a process of selection to 
ensure that they are able to recognize nonself. This process 
might be conducted by negative selection: if a T cell detects 
any self protein, it is discarded; otherwise, it survives [4]. 

Given a collection of self patterns S, a typical NSA 
comprises two phases: detector generation and detection 
[2], [12]. In the detector generation phase (Fig. 1a), the 
detector candidates are generated randomly and censored by 
matching them against given self samples taken from the set 
S. The candidates that match any element of S are eliminated 
and the rest are kept and stored in the set D. In the detection 
phase (Fig. 1b), the collection of detectors is used to 
distinguish self (system components) from nonself (outliers 
such as viruses, worms, etc.). If an incoming data instance 
matches any detector, it is claimed as nonself, and it is 
claimed as self otherwise. 
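
The two phases described above can be sketched generically as follows. This is a minimal illustration, not the algorithm proposed in this paper: the Hamming-distance matching rule, the function names and the toy self set are assumptions chosen only to keep the example short.

```python
import random


def hamming_match(detector, sample, max_distance=1):
    """Toy matching rule (an assumption): match when at most max_distance bits differ."""
    return sum(a != b for a, b in zip(detector, sample)) <= max_distance


def generate_detectors(self_set, length, count, matches=hamming_match,
                       seed=0, max_tries=10000):
    """Generation phase: draw random candidates and keep those matching no self sample."""
    rng = random.Random(seed)
    detectors = set()
    for _ in range(max_tries):
        if len(detectors) >= count:
            break
        candidate = "".join(rng.choice("01") for _ in range(length))
        # Censoring: discard any candidate that matches a self sample
        if not any(matches(candidate, s) for s in self_set):
            detectors.add(candidate)
    return detectors


def classify(sample, detectors, matches=hamming_match):
    """Detection phase: a sample matched by any detector is claimed as nonself."""
    return "nonself" if any(matches(d, sample) for d in detectors) else "self"


self_set = {"010101", "111010", "101101", "100011", "010111"}
detectors = generate_detectors(self_set, length=6, count=5)
print(sorted(detectors))
print([classify(s, detectors) for s in sorted(self_set)])
```

Because the toy rule is symmetric, censoring guarantees that every self sample is classified as self.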

From a machine learning perspective, negative selection is 
usually described as an anomaly detection technique. Since its 
introduction, the NSA has been a source of inspiration for many 
computing applications, especially intrusion detection [4], 
[14], computer virus detection [9], monitoring of UNIX 
processes [8], spam detection [18], and modeling of 
immunological processes such as HIV infection [15]. 

(a) Generation of detector set (b) Detection of new instances 

Fig. 1. Outline of a typical negative selection algorithm [13]. 

For binary-based AIS (i.e. cells and 
detectors are represented as binary strings), r-chunk and r-contiguous bits 
(rcb) are two common matching rules used for the 
construction of the detector set (and also for the detection phase). 
The r-chunk matching rule can be seen as a generalization of 
the rcb matching rule. To date, there have been some 
computing models of binary detectors that can generate a 
complete and non-redundant detector set, called a perfect 
detector set: a minimum set of detectors with the same 
detection ability as the set of all possible detectors. 
Examples are models based on prefix trees [5], [16] or automata [6]. 
In these models, detectors are represented as a whole structure 
(tree or automaton) rather than a set of individual strings. While 
they provide a more compact representation of the detector 
sets for AIS and therefore achieve a better detection time 
complexity, these models of binary-based AIS are hard to 
deploy in distributed environments. One desirable 
property of an NSA is its ability to be implemented in a 
distributed manner: each detector might detect a different kind 
of nonself, which is desirable for many applications such as 
computer security systems. Therefore, the focus of this paper 
is on binary-based NSAs that employ a discrete set of 
detectors (strings) so that they can be implemented in 
distributed environments. We can, for example, randomly 
divide the discrete set of detectors into subsets, one 
for each node in a distributed environment. 

With respect to binary-based AIS using a discrete detector 
set, to the best of our knowledge, the only algorithm for 
generating a perfect and discrete set of r-chunk (rcb-based) 
detectors was proposed by T. Stibor in [21] (by S. T. Wierzchon 
in [20]), which has frequently been cited, compared, and 
applied in the literature, with 44 (47) citations on Google 
Scholar. The main contribution of our paper is the design of a new 
deterministic algorithm to generate a perfect and discrete set 
of rcbvl-based detectors, which is equivalent to a full set of 
r-chunk-based detectors in terms of anomaly detection. 
Moreover, a compact string-based detector set can achieve 
better memory and time complexities compared to 
conventional algorithms. 

The rest of the paper is organized as follows. In the next 
section, we first present some basic terms and definitions. 
After the introduction of strings, two common matching rules 
for generating detector sets, r-chunk and rcb, are given. Then 
we introduce an r-contiguous bit matching rule with variable 
length (rcbvl) detectors. This new type of detector set is more 
compact, in bits, than one based on the original r-chunk and rcb rules. 
Prefix trees are introduced as temporary data structures for 
generation. Section 3 details our new NSA that can generate 
perfect detector sets based on rcbvl. Section 4 briefly describes 
our experiments in generating perfect detector sets. Section 5 
concludes the paper and discusses some possible future work. 

II. Basic Terms And Definitions 

In NSAs, an essential component is the matching rule, 
which determines the similarity between detectors and self 
samples (in the detector generation phase) and incoming data 
instances (in the detection phase). Obviously, the matching 
rule is dependent on detector representation. In this paper, 
both self and nonself cells are represented as binary strings of 
fixed length. This representation is the simplest and most 
popular representation for detectors and data in AIS, and 
other representations (such as real valued) could be reduced 
to binary [10], [13]. 

A. Strings 

An alphabet Σ is a nonempty and finite set of symbols. A 
string s ∈ Σ* is a sequence of symbols from Σ, and its length is 
denoted by |s|. A string is called the empty string if its length 
equals 0. Given an index i ∈ {1,...,|s|}, s[i] is the symbol 
at position i in s. Given two indices i and j, whenever j ≥ i, 
s[i...j] is the substring of s with length j − i + 1 that starts at 
position i, and if j < i, then s[i...j] is the empty string. A string 
s’ is a prefix of s if s’ = s[1...j], 1 ≤ j ≤ |s|. 

Given a string s ∈ Σ^ℓ, a non-empty string d, and an index i ∈ 
{1,...,ℓ − r + 1}, we say that d occurs in s at position i if s[i...i 
+ |d| − 1] = d. Moreover, the concatenation of two strings s and s’ 
is s + s’. 

Although our approach can be implemented on any finite 
alphabet, the strings used in all examples are binary, Σ = 
{0,1}, for ease of understanding. 

B. R-chunk and rcb matching rules 

For binary-based AIS, rcb and r-chunk are among the 
most common matching rules. We are given a positive integer r and a set 
S of self strings of length ℓ. A detector under the rcb matching 
rule is a string of length ℓ that does not match any s ∈ S. It is 
said to match another string of the same length if they have r 
consecutive matching bits in the corresponding positions. Rcb 
was introduced and used in many AIS projects [7], [11], [17]. 
An r-chunk detector is a tuple of a string of r bits and its 
starting position within the string that does not match any s ∈ S. 
An r-chunk detector (d,i) is said to match a string s if d is a 
prefix of s[i...|s|]. The r-chunk matching rule is considered a 
simplification of the rcb matching rule [17]. This type of 
detector helps AIS to achieve better results on data where 
adjacent regions of the input data sequence are not necessarily 
semantically correlated, such as in network data packets [3]. It 
is noted that an r-contiguous detector [4] can be decomposed 
into ℓ − r + 1 overlapping r-chunk detectors. 


Example 1: Let ℓ = 6, r = 3. Given a set of five self strings S 
= {s_1 = 010101, s_2 = 111010, s_3 = 101101, s_4 = 100011, s_5 = 
010111}. The set of all r-chunk detectors is {(000,1), (001,1), 
(011,1), (110,1), (001,2), (010,2), (100,2), (111,2), (000,3), 
(100,3), (111,3), (000,4), (001,4), (100,4), (110,4)}. The set 
of all detectors under the rcb matching rule is {001000, 001001, 
011110, 110000, 110001}. 
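
The two detector sets of Example 1 can be reproduced by brute-force enumeration. The sketch below is only a check of the definitions for small ℓ and r, with hypothetical function names; it is not the generation algorithm proposed in this paper.

```python
from itertools import product


def r_chunk_detectors(self_set, length, r):
    """All (chunk, position) pairs, 1-based, whose chunk occurs in no self string there."""
    detectors = set()
    for pos in range(1, length - r + 2):
        self_chunks = {s[pos - 1:pos - 1 + r] for s in self_set}
        for bits in product("01", repeat=r):
            chunk = "".join(bits)
            if chunk not in self_chunks:
                detectors.add((chunk, pos))
    return detectors


def rcb_detectors(self_set, length, r):
    """All strings that share no r contiguous bits, position by position, with any self string."""
    chunks = r_chunk_detectors(self_set, length, r)
    result = set()
    for bits in product("01", repeat=length):
        d = "".join(bits)
        # d is an rcb detector exactly when each of its r-bit windows is an r-chunk detector
        if all((d[pos - 1:pos - 1 + r], pos) in chunks
               for pos in range(1, length - r + 2)):
            result.add(d)
    return result


S = {"010101", "111010", "101101", "100011", "010111"}
chunks = r_chunk_detectors(S, 6, 3)
print(len(chunks))                      # 15 r-chunk detectors, as listed in Example 1
print(sorted(rcb_detectors(S, 6, 3)))   # the five rcb detectors of Example 1
```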

C. Rcbvl matching rule 

Given a positive integer r and a set S of self strings of length ℓ, 
a triple (d,i,j) of a string d ∈ Σ^k, 1 ≤ k ≤ ℓ, an integer i ∈ 
{1,...,ℓ − r + 1} and an integer j ∈ {i,...,ℓ − r + 1} is called a 
negative detector under the rcbvl matching rule if d does not 
occur in any s ∈ S. In other words, (d,i,j) is an rcbvl 
detector if there exist j − i + 1 r-chunk detectors (d_1,i),..., 
(d_{j−i+1},j) such that d_k and d_{k+1} are two strings overlapping 
in (r − 1) bits, k = 1,...,j − i. 

Example 2: Given ℓ, r and the set S of self strings as in 
Example 1. The triple (0001,1,2) is an rcbvl detector because 
there exist two 3-chunk detectors (000,1), (001,2) such that 000 
and 001 are two 2-bit overlapping strings. A perfect detector 
set D under the rcbvl matching rule contains 5 variable-length 
detectors {(0001,1,2), (00100,1,4), (100,4,4), (011110,1,4), 
(11000,1,3)}. It is a minimum detector set (23 bits) that 
covers the whole detector space of the r-chunk detector set in Example 
1 (45 bits). 
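
The coverage and bit counts claimed in Example 2 can be checked with a short sketch. It assumes that a triple (d,i,j) is read as one chunk per position from i to j, aligned at position i, and that a tail chunk shorter than r stands for all of its r-bit completions; this reading is inferred from the use of min in Algorithm 2, and the helper names are hypothetical.

```python
R = 3
# r-chunk detectors of Example 1 as (chunk, position) pairs
R_CHUNKS = {
    ("000", 1), ("001", 1), ("011", 1), ("110", 1),
    ("001", 2), ("010", 2), ("100", 2), ("111", 2),
    ("000", 3), ("100", 3), ("111", 3),
    ("000", 4), ("001", 4), ("100", 4), ("110", 4),
}
# rcbvl detectors of Example 2 as (d, i, j) triples
D = [("0001", 1, 2), ("00100", 1, 4), ("100", 4, 4),
     ("011110", 1, 4), ("11000", 1, 3)]


def expand(detector, r):
    """Split (d, i, j) into (chunk, position) pairs; a short tail chunk acts as a prefix."""
    d, i, j = detector
    return [(d[p - i:p - i + r], p) for p in range(i, j + 1)]


expanded = {pair for det in D for pair in expand(det, R)}

# Completeness: every r-chunk detector starts with some expanded chunk at its position
complete = all(any(p == pos and chunk.startswith(c) for c, p in expanded)
               for chunk, pos in R_CHUNKS)
# Soundness: every r-bit completion of an expanded chunk is itself an r-chunk detector
sound = all(
    sum(1 for chunk, pos in R_CHUNKS if pos == p and chunk.startswith(c))
    == 2 ** (R - len(c))
    for c, p in expanded
)
rcbvl_bits = sum(len(d) for d, _, _ in D)
chunk_bits = R * len(R_CHUNKS)
print(complete, sound, rcbvl_bits, chunk_bits)
```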

D. Prefix trees 

A prefix tree T is a rooted directed tree with edge labels 
from Σ where for all σ ∈ Σ, every node has at most one 
outgoing edge labeled with σ. For a string s, we write s ∈ T if 
there is a path from the root of T to a leaf such that s is the 
concatenation of the labels on this path. This tree structure is 
important for generating the rcbvl detector set. 

Example 3: Given ℓ, r and the set of self strings S as in 
Example 1, the four prefix trees representing all binary 3-chunk 
detectors are shown in Fig. 2. 

Tree T_i represents the 3-chunk detectors (d,i), i = 1,...,4. 


III. New Negative Selection Algorithm 

Given a non-empty set S of self strings of length ℓ, and an 
integer r ∈ {1,...,ℓ}, this section presents a new NSA 
based on the rcbvl matching rule. 

A. Detectors set generation under rcbvl matching rule 

Algorithm 1 Algorithm to generate a perfect rcbvl detector 
set. 

1: procedure GENERATIONDETECTORS(S, ℓ, r, D) 
2:   for i = 1,...,ℓ − r + 1 do 
3:     Create an empty prefix tree T_i 
4:   end for 
5:   for all s ∈ S do 
6:     for i = 1,...,ℓ − r + 1 do 
7:       insert every s[i...i + r − 1] into T_i 
8:     end for 
9:   end for 
10:  for i = 1,...,ℓ − r + 1 do 
11:    for all non-leaf nodes n ∈ T_i and all σ ∈ Σ do 
12:      if no edge with label σ starts at n then 
13:        create a new leaf n’ and an edge (n,n’) labeled with σ 
14:      end if 
15:    end for 
16:    delete every node n ∈ T_i from which none of the newly created leaves is reachable 
17:  end for 
18:  D_1 = ∅ 
19:  D = {(s,1,1) | s ∈ T_1} 
20:  for i = 2,...,ℓ − r + 1 do 
21:    D_2 = ∅ 
22:    for all (s,k,j) ∈ D do 
23:      if there exists an s’ ∈ T_i where s[i − k + 1...|s|] is a prefix of it then 
24:        D_2 = D_2 ∪ {(s + s’[|s| − j + k...|s’|],k,i)} 
25:        delete every node n ∈ T_i from which only nodes in s’ are reachable 
26:        for all s’ ∈ T_i where s[i − k + 1...|s|] is a prefix of it do 
27:          if |s| − i + k < r then 
28:            D_2 = D_2 ∪ {(s[|s|] + s’,i − 1,i)} 
29:          else 
30:            D_2 = D_2 ∪ {(s’,i,i)} 
31:          end if 
32:          delete every node n ∈ T_i from which only nodes in s’ are reachable 
33:        end for 
34:      else 
35:        D_1 = D_1 ∪ {(s,k,j)} 
36:      end if 
37:    end for 
38:    for all s’ ∈ T_i do 
39:      D_2 = D_2 ∪ {(s’,i,i)} 
40:    end for 
41:    D = D_2 
42:  end for 
43:  D = D ∪ D_1 
44: end procedure 

Algorithm 1 summarizes the first phase of the new NSA. Prefix 
trees are first used to generate a perfect detector set 
from S, and this set is then used to decide whether a new sample 
is self or nonself. In the algorithm, we first construct for every 
position i ∈ {1,...,ℓ − r + 1} a 
prefix tree T_i. Each prefix tree T_i can be constructed as 
follows: start with an empty prefix tree and insert every s[i...i 
+ r − 1], s ∈ S, into it (lines 5-9). Next, for every non-leaf node 
n and every σ ∈ Σ where no edge with label σ starts at n, create 
a new leaf n’ and an edge (n,n’) labeled with σ. Finally, delete 
every node from which none of the newly created leaves is 
reachable (lines 10-17). The detector set D is first created from all s 
∈ T_1 in line 19. The rest of the algorithm, lines 20-42, 
updates partial detectors in D by identifying their right 
overlapping strings in the prefix trees. 
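
The tree construction in lines 1-17 can be sketched with nested dictionaries. This is an illustrative reading rather than the authors' implementation: it returns only the labels of the newly created leaves instead of physically pruning the tree, and the function names are hypothetical.

```python
def build_trie(chunks):
    """Insert every self chunk into a prefix tree stored as nested dictionaries."""
    root = {}
    for chunk in chunks:
        node = root
        for symbol in chunk:
            node = node.setdefault(symbol, {})
    return root


def detector_leaves(node, r, alphabet="01", prefix=""):
    """Labels of the new leaves: paths that leave the self tree at a missing edge."""
    if len(prefix) == r:
        # A leaf of the self tree; new leaves are only attached to non-leaf nodes
        return []
    leaves = []
    for symbol in alphabet:
        if symbol in node:
            leaves += detector_leaves(node[symbol], r, alphabet, prefix + symbol)
        else:
            leaves.append(prefix + symbol)
    return leaves


S = ["010101", "111010", "101101", "100011", "010111"]
r, length = 3, 6
for i in range(1, length - r + 2):
    tree = build_trie(s[i - 1:i - 1 + r] for s in S)
    print(i, detector_leaves(tree, r))
```

For position 1 the sketch yields 00, 011 and 110, where the two-bit leaf 00 stands for both 3-chunk detectors 000 and 001.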


From the description of the algorithm, it takes |S|·(ℓ − r + 1)·r 
steps to generate the (ℓ − r + 1) prefix trees and |D|·(ℓ − r + 1)·2^r steps 
to generate the perfect detector set D. 

Example 4: Given ℓ, r and the set of self strings S as in 
Example 1. Some steps of Algorithm 1 in generating the 
perfect detector set of Example 2 are as follows. The set D is first created 
as (00,1,1), (011,1,1), (110,1,1). Then the for loop (lines 
20-42) calculates D and D_1 as follows: 

For i = 2: D = (0001,1,2); (0010,1,2); (0111,1,2); (1100,1,2) 
and D_1 = ∅. For i = 3: D = (00100,1,3); (01111,1,3); 
(11000,1,3) and D_1 = (0001,1,2). For i = 4: D = (00100,1,4); 
(011110,1,4) and D_1 = (0001,1,2); (11000,1,3); (100,4,4). 
The final step, D = D ∪ D_1 in line 43, generates the perfect 
detector set {(0001,1,2), (00100,1,4), (100,4,4), (011110,1,4), 
(11000,1,3)}. 

B. Detection under Rcbvl matching rule 

To detect whether a given string s is self or nonself, we simply 
check our rcbvl matching rule on s against every detector in 
D. If any detector matches, s is output as nonself; otherwise s is self. The 
function min used in Algorithm 2 returns the smaller 
of two values. It is easy to see that this algorithm 
has the same time complexity as Algorithm 1. 

Algorithm 2 Algorithm to detect if a given string s is self or 
nonself. 

1: procedure DETECTION(s, ℓ, r, D) 
2:   for all (s’,n,m) ∈ D do 
3:     for i = n,...,m do 
4:       if s’[i...min(i + r − 1,|s’|)] occurs in s at position i then 
5:         output s is nonself 
6:         exit procedure 
7:       end if 
8:     end for 
9:   end for 
10:  output s is self 
11: end procedure 
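
A runnable sketch of the detection phase is given below. It assumes that the detector string is aligned at its start position n, so the chunk for absolute position i is taken at offset i − n inside the detector; this indexing is an interpretation of line 4 and is not stated explicitly in the pseudocode. The function name is hypothetical.

```python
def is_nonself(s, detectors, r):
    """Return True if any rcbvl detector (d, n, m) matches the string s."""
    for d, n, m in detectors:
        for i in range(n, m + 1):
            # Chunk of d for absolute position i (1-based), assuming d starts at position n;
            # slicing truncates at the end of d, which plays the role of min(...)
            chunk = d[i - n:i - n + r]
            if chunk and s[i - 1:i - 1 + len(chunk)] == chunk:
                return True
    return False


# Perfect rcbvl detector set of Example 2 and self set of Example 1
D = [("0001", 1, 2), ("00100", 1, 4), ("100", 4, 4),
     ("011110", 1, 4), ("11000", 1, 3)]
S = ["010101", "111010", "101101", "100011", "010111"]

print([is_nonself(s, D, 3) for s in S])  # no self string is flagged
print(is_nonself("001000", D, 3))        # an rcb detector from Example 1 is flagged
```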

IV. Experiments 

We use a popular flow-based dataset, NetFlow [19], and a 
random dataset for experiments. The flow-based NetFlow dataset, 
generated from the packet-based DARPA dataset [1], is used for 
case study 1. This dataset focuses only on flows to a specific 
port and an IP address which receives the largest number of 
attacks. It contains all 129,571 traffic flows (including attacks) to 
and from victims. Each flow in the dataset has 10 fields: 
Source IP, Dest. IP, Source Port, Dest. Port, Packets, Octets, 
Start Time, End Time, Flags, and Proto. Similar to the 
previous study [19], we select the same 4 features (Packets, 
Octets, Duration and Flags) from the NetFlow dataset as the 
input of the two experiments in case study 1. A randomly created 
dataset is used for the experiments in case study 2. This dataset 
contains 50,000 binary strings of length 30. 

Flows in NetFlow are converted into binary strings by two 
steps. The first step is to map all features to binary string 
features. After this step, a total string features are constructed 
for both normal data and anomalous one. The second step is to 
concatenate the binary string features for every flows. After 
this step, dataset contains binary strings with their length of 
49. The distributions of training and testing datasets as well as 
parameters r, i for 4 experiments are described in Table 1. 
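The two conversion steps can be illustrated with a short Python sketch. The per-feature bit widths below are hypothetical (the text does not give them); they are chosen only so that the concatenated string has the stated length of 49 bits. The function name `encode_flow` and the example values are likewise assumptions.

```python
# Hypothetical bit widths per feature; they are chosen only so that the
# total is 49 bits, matching the string length reported in the text.
FEATURE_BITS = {"packets": 12, "octets": 16, "duration": 15, "flags": 6}


def encode_flow(flow: dict) -> str:
    """Map each feature to a fixed-width binary string, then concatenate."""
    parts = []
    for name, width in FEATURE_BITS.items():
        value = min(int(flow[name]), (1 << width) - 1)  # clamp to the width
        parts.append(format(value, f"0{width}b"))
    return "".join(parts)


example = {"packets": 5, "octets": 320, "duration": 12, "flags": 2}
bits = encode_flow(example)
print(len(bits), bits)
```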

Table 1. Data and parameters distribution for experiments
and result comparison.

                                    Size (bits)          Time (ms)
Case     ℓ    r    Train    Test    r-chunk    rcbvl     r-chunk   rcbvl
Case 1   49   10   119571   10000   206810     42704     58962     11960
Case 1   49   8    79571    50000   31672      8096      48952     11085
Case 2   30   12   25000    25000   367092     79222     243979    44995
Case 2   30   14   40000    10000   2324056    392815    518999    82922



Fig. 3. Size of detectors comparisons (legend: r-chunk, rcbvl)

Fig. 4. Classification time comparisons


Results in Table 1 show that our proposed algorithm reduces
both the size (bits) of detectors and the time (milliseconds) to
classify the testing dataset. The comparisons of detector size
and detection time are illustrated in Fig. 3 and Fig. 4,
respectively.

V. Conclusion 

In this paper, we have proposed a novel NSA to generate
perfect detector sets for string-based AIS. We developed an
rcbvl matching rule as an extension of the traditional rcb rule.
Our new algorithm has polynomial time complexity. More
importantly, the proposed algorithm always generates
complete and non-redundant detector sets for string-based
AIS. Experimental results show that the proposed algorithm
can reduce both the detection-phase time and the storage of
detectors. Moreover, varying the parameter r in the rcbvl
matching rule can balance specialization and generalization
in classification systems, which will be the next step of our
study. How to apply the algorithm to intrusion detection
systems is another research direction of interest.

Structural Analysis and Progressive Failure Analysis
of Laminated Composite Joints-Single Pin Configuration

Anna Tomy Manavalan, Dr. R Suresh, C. K. Krishnadasan , Swapna Thomas 


Abstract — A composite material is prepared by joining two 
or more materials of different properties. The joined materials
work together and give a new material with unique properties.
The use of composites is motivated by their high
stiffness-to-weight and strength-to-weight ratios. Composites
show complex damage behaviour due to their anisotropic
nature and heterogeneity, so the detailed analysis of composite
structures is a formidable task. A joint is a structural
connection between two or more members intended for load
transfer, and most structures contain one or more joints.
Joints are one of the greatest challenges in the design of
composite structures because, owing to the anisotropy and
heterogeneity of the material, they introduce high local
stress concentrations. Damage initiation and propagation is the
greatest concern in understanding the behaviour of bolted
connections in composites. Finite element modelling can
support laboratory tests, aid joint design and predict the
propagation of damage. In the present study, double lap joints
are analysed using continuum shell elements, and a
progressive failure analysis is carried out using the Tsai-Hill
failure criterion and a material stiffness degradation
mechanism. The progressive failure analysis determined the
mode of failure and showed good correlation with the stress
results. Two primary modes of failure were observed: fibre
failure and matrix failure.

Index Terms — Composite, anisotropic, progressive failure 
analysis, mode of failure. 

I. Introduction 

A composite material is prepared by joining two or more
materials of different properties. The use of composites is
motivated by high specific stiffness and high specific strength
[1]. Improved weight savings, increased fuel efficiency,
enhanced durability, and superior structural properties make
composite materials ideal for aerospace applications [2].
From the library of available elements, composites can be
modeled using shell elements, continuum shell elements and
solid elements [3].

A joint is a structural connection between two or more
members intended for load transfer. All structures contain
joints. Joints are one of the greatest challenges in the design
of composite structures because, owing to the anisotropy and
heterogeneity of the material, they introduce high local stress
concentrations. Thus the overall structural capacity is
determined by the joints.

Several researchers have studied the strength of
single-lap composite bolted joints [4-6]. The effect of bolt-hole
clearance was investigated on single-lap, single-bolt
composite joints. Increasing clearance was found to result in
reduced joint stiffness and increased ultimate strain in all
tested configurations. In single-lap joints, clearance caused
three-dimensional variations in the stress distribution in the
laminate. These variations depend on the lay-up sequence.
A highly efficient user-defined finite element model and
empirical expressions were developed to determine the
bolt-load distribution in large-scale composite structures
[7, 8].

Damage initiation and propagation is the greatest concern
in understanding the behavior of bolted connections in
composites. Finite element modeling can support laboratory
tests, aid joint design and predict the propagation of damage.
Failure modes and trends in material response are evaluated
to assess the progression of failure in composite joints. The
main progressive damage approaches are: a) continuum
damage mechanics (CDM) or the material
properties/stiffness degradation method (MPDM), in which
all forms of damage are represented as local stiffness
reductions in individual elements; Poisson's ratios are not
degraded and only the Young's moduli and shear modulus are
modified for a failed element; b) discrete damage modeling
(DDM), in which matrix cracks and delaminations are
explicitly introduced into the model as the displacement
discontinuities they create; c) X-FEM formulations, in which
degrees of freedom are added to elements along the crack
surface to describe the displacement discontinuity; and d)
cohesive elements or the element failure method (EFM), used
to model crack opening [9, 10]. Failure is predicted using
various failure criteria such as the Hashin, Tsai-Hill and
Tsai-Wu failure theories. The results obtained were compared
and plotted against available experimental findings
[11,13].

In the present study, a validation procedure was carried out
to determine the accuracy of SC8R continuum shell elements
and to verify the modeling strategy. Double lap joints were
then analysed using continuum shell elements, and a
progressive failure analysis was done using the Tsai-Hill
failure criterion and a material stiffness degradation
mechanism.

II. Failure Criteria 

A successful design requires efficient and safe use of
materials. Composite materials have many mechanical
characteristics that are different from those of more
conventional engineering materials. Composite materials are
inhomogeneous (i.e. have non-uniform properties over
the body) and non-isotropic (orthotropic or more generally
anisotropic). An orthotropic body has material properties that
are different in three mutually perpendicular directions and
has three mutually perpendicular planes of material property
symmetry. Thus the properties depend on orientation at a
point in the body. Isotropic materials mainly have two
strength parameters: normal strength and shear
strength. Failure is initiated in an isotropic material if either
stress exceeds the corresponding ultimate strength.

Theories were developed to compare the state of stress in a
material to failure criteria. The two failure theories used are
the Tsai-Hill failure theory and the Tsai-Wu failure theory, in
which the strength parameters (Xt, Yt, Xc, Yc and S) are
determined through experiments and the induced stresses
(S11, S22 and S12) are results obtained from the finite
element (FE) model.

Xt - Tensile strength in X direction
Yt - Tensile strength in Y direction
Xc - Compressive strength in X direction
Yc - Compressive strength in Y direction
S - Shear strength

S11 - Stress induced in principal direction
S22 - Stress induced in transverse direction
S12 - Shear stress induced

a) Tsai-Hill Failure Theory

$$I_F = \frac{S_{11}^2}{X^2} - \frac{S_{11} S_{22}}{X^2} + \frac{S_{22}^2}{Y^2} + \frac{S_{12}^2}{S^2} \qquad (1)$$

If $I_F > 1$, failure has occurred.





Fig. 1. Configuration (L = 100 mm)


A. Material Properties 

The materials used for the composite laminate are
carbon-epoxy, glass-epoxy and steel. The material properties
are shown in Table 1 and Table 2.

Table 1. Normalized material properties of carbon-epoxy
and glass-epoxy

Property    Carbon-epoxy    Glass-epoxy
E_L/E_T     16.63           2.47
ν_LT        0.31            0.229
G_LT/E_T    0.67            0.25
X_L/X_T     2.03            1.70
Y_L/X_T     0.04            0.3
Y_T/X_T     0.09            0.45


Table 2. Material properties of steel

Property     Steel
E_L (MPa)    200000
ν_LT         0.31


b) Tsai-Wu Failure Theory

$$I_F = F_1 S_{11} + F_2 S_{22} + F_{11} S_{11}^2 + F_{22} S_{22}^2 + F_{66} S_{12}^2 + 2 F_{12} S_{11} S_{22} \qquad (2)$$

If $I_F > 1$, failure has occurred.


B. Composite layup configuration 

The total thickness of the composite layup is 4 mm and the
layup sequence is shown in Table 3. The composite
plate is symmetric about the mid layer, thus only half the
thickness is considered for analysis. Continuum shell
elements are used to mesh the composite layup. Figure 2
shows the ply stack diagram of the composite plate. Figure 3
shows the orientation of the plies with respect to the loading
direction.

Layup sequence: [90/0/-45/45/0/90/0-G 1/2]s


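Where,

$$F_1 = \frac{1}{X_t} + \frac{1}{X_c}, \quad F_2 = \frac{1}{Y_t} + \frac{1}{Y_c}, \quad F_{11} = -\frac{1}{X_t X_c}, \quad F_{22} = -\frac{1}{Y_t Y_c}, \quad F_{66} = \frac{1}{S^2}, \quad F_{12} = f^{*} \sqrt{F_{11} F_{22}}$$

and $f^{*}$ is a constant whose default value is zero.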
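As a worked illustration of Eq. (1) and Eq. (2), the Python sketch below evaluates both failure indices for a single stress state. It rests on assumptions that the text does not state: strengths are passed as positive magnitudes, the Tsai-Hill index picks the tensile or compressive strength according to the sign of the stress, and the Tsai-Wu coefficients treat compressive strengths as negative numbers. The function names are hypothetical, and the strength values are the glass-fiber/epoxy values from Table 4, used only as sample input.

```python
def tsai_hill(s11, s22, s12, Xt, Xc, Yt, Yc, S):
    """Tsai-Hill failure index, Eq. (1)."""
    # Assumption: X and Y are the tensile strengths when the matching
    # stress is tensile, and the compressive strengths otherwise.
    X = Xt if s11 >= 0 else Xc
    Y = Yt if s22 >= 0 else Yc
    return s11**2 / X**2 - s11 * s22 / X**2 + s22**2 / Y**2 + s12**2 / S**2


def tsai_wu(s11, s22, s12, Xt, Xc, Yt, Yc, S, f_star=0.0):
    """Tsai-Wu failure index, Eq. (2)."""
    # Assumption: strengths arrive as positive magnitudes; the compressive
    # ones are negated here so the coefficient formulas keep their signs.
    xc, yc = -abs(Xc), -abs(Yc)
    F1 = 1 / Xt + 1 / xc
    F2 = 1 / Yt + 1 / yc
    F11 = -1 / (Xt * xc)
    F22 = -1 / (Yt * yc)
    F66 = 1 / S**2
    F12 = f_star * (F11 * F22) ** 0.5
    return (F1 * s11 + F2 * s22 + F11 * s11**2 + F22 * s22**2
            + F66 * s12**2 + 2 * F12 * s11 * s22)


strengths = dict(Xt=800.0, Xc=350.0, Yt=50.0, Yc=125.0, S=120.0)
print(tsai_hill(400.0, 0.0, 0.0, **strengths))
print(tsai_wu(800.0, 0.0, 0.0, **strengths))
```

A uniaxial stress equal to the longitudinal tensile strength gives a Tsai-Wu index of about 1, which is a quick sanity check on the coefficient signs.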


III. Model Configuration 

In this configuration the width of the composite plate is
varied and the effect of the change in width is
investigated. The plate is subjected to a load of 14 kN. The
centre-to-centre distance between bolts is taken as 100 mm
and the edge distance is 15 mm, as shown in Figure 1. The
diameter of the bolt is 10 mm. The widths of the composite
plate are 50 mm, 40 mm, 35 mm, 20 mm and 15 mm.


Table 3. Layup configuration of composite plate

Layer No.   Material       Thickness (mm)   Fiber orientation (degree)
1           Carbon-epoxy   0.3              90
2           Carbon-epoxy   0.3              0
3           Carbon-epoxy   0.3              -45
4           Carbon-epoxy   0.3              45
5           Carbon-epoxy   0.3              0
6           Carbon-epoxy   0.3              90
7           Glass-epoxy    0.3              0
8           Carbon-epoxy   0.3              90
9           Carbon-epoxy   0.3              0
10          Carbon-epoxy   0.3              45
11          Carbon-epoxy   0.3              -45
12          Carbon-epoxy   0.3              0
13          Carbon-epoxy   0.3              90

Fig. 2. Ply stack plot of composite plate.



Fig. 3. Orientation of ply with respect to loading direction 
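The symmetric layup of Table 3 can be expanded programmatically, which is a convenient check on the layer order. The sketch below is illustrative only; it assumes the glass-epoxy ply sits on the mid-plane and is therefore not mirrored. With 13 plies of 0.3 mm it gives 3.9 mm, close to the nominal 4 mm quoted for the layup.

```python
# Half of the symmetric layup from Table 3: (material, angle in degrees).
# The last entry is the mid-plane glass-epoxy ply, which is not repeated.
half = [("carbon-epoxy", 90), ("carbon-epoxy", 0), ("carbon-epoxy", -45),
        ("carbon-epoxy", 45), ("carbon-epoxy", 0), ("carbon-epoxy", 90),
        ("glass-epoxy", 0)]
PLY_THICKNESS_MM = 0.3

# Mirror every ply except the mid-plane one to obtain the full stack.
full = half + half[-2::-1]
total_thickness = len(full) * PLY_THICKNESS_MM

for number, (material, angle) in enumerate(full, start=1):
    print(number, material, angle)
print(f"{len(full)} plies, {total_thickness:.1f} mm")
```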


IV. Progressive Failure Analysis

Failure of composite structures is a progressive series of
events. It often starts as a tiny crack between the fibres and
matrix. These cracks reduce the stiffness of the composite.
Capturing stress redistribution is the key to realistic
simulation of failure in composite structures. Progressive
failure analysis is done on the same configuration at the
failure load and helps identify the mode of failure. The
damage in composite structures is generally a combination of
matrix cracking, fibre breakage in tension and compression,
and delamination. The first two damage modes, matrix
cracking and fibre breakage in tension and compression, are
considered.

The procedures for predicting the growth of the damage
path are developed using the progressive failure analysis
methodology implemented within finite element analysis.
The methodology generally consists of three steps (Figure 4
shows the flow chart): a) calculating the lamina stresses
(computed in the principal, transverse and shear directions),
b) estimating the failure index and c) degrading the material
stiffness in the failed elements to represent damage. In the
study, intra-laminar failure modes are considered. Geometric
and material nonlinearity were included in the model. The
third and final step in the progressive failure analysis is to
apply the material degradation model to the failed material
points. The material properties are degraded based upon the
damage mode. The progressive failure analysis is
implemented in Abaqus. The process is invoked at each
material point of an element to evaluate the failure criterion.
When failure is detected, the degradation model is applied
accordingly. In this model, the material stiffnesses E11, E22
and G12 are instantaneously reduced by a factor of 1000.

Fig. 4. Flow chart showing process of progressive failure
analysis

V. Modeling

The FEM model is shown in Figure 5. The model has
symmetry about the X, Y and Z directions, thus only 1/8th of
the configuration is analysed. Figure 6 shows the meshed
model configuration used for analysis in Abaqus.

Fig. 5. Configuration model in Abaqus

Fig. 6. Configuration meshed model in Abaqus
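The stiffness degradation step of the progressive failure analysis can be sketched in a few lines of Python. This is not the Abaqus implementation used in the study; it is a minimal illustration that assumes "reduced by 1000" means division by 1000, that degradation is applied once per material point, and that Poisson's ratio is left unchanged. The class and function names and the sample stiffness values are hypothetical.

```python
from dataclasses import dataclass

DEGRADATION_FACTOR = 1000.0  # stiffness is divided by this value on failure


@dataclass
class MaterialPoint:
    E11: float
    E22: float
    G12: float
    failed: bool = False


def degrade_if_failed(point: MaterialPoint, failure_index: float) -> MaterialPoint:
    """Apply instantaneous stiffness degradation once the index exceeds 1."""
    if failure_index > 1.0 and not point.failed:
        point.E11 /= DEGRADATION_FACTOR
        point.E22 /= DEGRADATION_FACTOR
        point.G12 /= DEGRADATION_FACTOR
        point.failed = True  # Poisson's ratio is left untouched
    return point


ply = MaterialPoint(E11=44000.0, E22=10500.0, G12=4000.0)  # sample values, MPa
degrade_if_failed(ply, failure_index=1.7)
print(ply)
```

In a full analysis this check would run at every material point after each load increment, so that the stresses are redistributed before the next failure index evaluation.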
VI. Results 


A. Validation 

The accuracy of any FE model depends on the accuracy
of the geometry, the type and number of elements used, and
the material property model. Validation is done in this study
to check whether the SC8R elements produce the required
results for the composite layup and to check the modeling
strategy. For that purpose, a problem studied by Buket Okutan
[14] is selected and modeled using FE. The results
are compared with those of the previous study.

Geometry: A rectangular composite plate has length L,
thickness t and width W, with a hole of diameter D. The hole
is at a distance E from the free edge of the plate. The
configuration of the composite plate is shown in Figure 7. In
the study [14] it was observed that, for the [0/90/0]s laminate,
the failure mode is bearing when the E/D
ratio is greater than 3. Thus an E/D ratio of 4 was taken for
validation purposes.

Fig. 7. Geometry of specimen

Material: Table 4 shows the material properties of the
glass-fiber/epoxy composite. The stacking sequence of the
composite plate is [0/90/0]s.

Load: A tensile load is applied at the hole-free edge of the
plate and is resisted by the pin.

FE model and boundary condition: The composite plate was
modeled in Abaqus using continuum shell elements
(SC8R). The widths modeled are 20 mm, 30 mm, 40 mm and
50 mm. Figure 8 shows the FE model of the joint
configuration. One quarter of the configuration was modeled
due to symmetry in the Y and Z directions, and symmetric
boundary conditions were applied. The tensile load is applied
at the hole-free end of the composite. The pin was assumed
to be rigid and thus not modelled. The degrees of freedom
were arrested in the quarter portion of the bolt hole to simulate
the support conditions.

Table 4. Properties of glass-fiber/epoxy composite

Property                              Value
Longitudinal modulus E1 (MPa)         44,000 (±560)
Transverse modulus E2 (MPa)           10,500 (±420)
Shear modulus G12 (MPa)               388045 (±360)
Poisson's ratio ν12                   0.36
Longitudinal tension Xt (MPa)         800 (±59)
Longitudinal compression Xc (MPa)     350 (±42)
Transverse tension Yt (MPa)           50 (±4.35)
Transverse compression Yc (MPa)       125 (±9.34)
Shear strength S (MPa)                120 (±15.28)


Fig. 8. FE model done for the joint configuration


Results: The obtained results (Figure 9) show the
variation of bearing strength with w/d ratio. The results show
good correlation with the previous study [14], and thus the
modeling approach is validated. SC8R elements can therefore
be used to model the composite layup. In the present study,
bolts are also modeled to replicate the contact behaviour of
the real problem.



Fig. 9. Variation of bearing strength with w/d ratio 



B. Stress Results 


Figure 10 shows the S11 stress distribution in each layer of
the composite layup for the single pin configuration of width
15 mm. It can be seen from Figure 10 that stress is
concentrated at the bolt hole. Thus the circumference of the
bolt hole is set as the region of interest. Figures 11 to 13 show
the variation of the stresses S11, S22 and S12 around the bolt
hole in each layer of the composite plate. The S11, S22 and
S12 stresses are evaluated and used to calculate the
failure index. The Tsai-Hill (TSAIH) and Tsai-Wu (TSAIW)
failure criteria are used to determine the failure index of each
layer, and the layer in which first failure (first ply failure)
occurs is determined. The stress component responsible
and the failure load are also determined. Table 5 and Table 6
show the failure assessment details using the Tsai-Hill and
Tsai-Wu failure theories.

Fig. 10. Stress S11 in each layer of 15mm configuration

Fig. 11. Variation of stress S11 around the bolt hole in each
layer of the composite plate for single pin configuration




Fig. 12. Variation of stress S22 around the bolt hole in each 
layer of the composite plate for single pin configuration 



Fig. 13. Variation of stress S12 around the bolt hole in each 
layer of the composite plate for single pin configuration 


Table 5. Failure assessment details of configuration by Tsai-Hill failure criteria

Width   Load   Tsai-Hill       Initial failure   Initial failure layer   Location of failure,           Failure
(mm)    (kN)   failure index   layer number      orientation (degree)    circumference angle (degree)   component
15      3.5    3.286           6                 90                      95.58                          S22
20      3.5    2.161           6                 90                      95.58                          S22
35      3.5    1.740           4                 45                      52.23                          S11
40      3.5    1.726           4                 45                      52.23                          S11
50      3.5    1.72            4                 45                      52.23                          S11


Table 6. Failure assessment details of configuration by Tsai-Wu failure criteria

Width   Load   Tsai-Wu         Initial failure   Initial failure layer   Location of failure,           Failure
(mm)    (kN)   failure index   layer number      orientation (degree)    circumference angle (degree)   component
15      3.5    3.421           6                 90                      95.58                          S22-square
20      3.5    2.270           6                 90                      95.58                          S22-square
35      3.5    1.991           4                 45                      52.23                          S11-square
40      3.5    1.972           4                 45                      52.23                          S11-square
50      3.5    1.971           4                 45                      52.23                          S11-square


C. Progressive Failure Analysis 
a) 15 mm width 

Table 7. Progressive failure analysis of configuration 15mm

No   Failure load (kN)   Layer failing sequence
1    1.095               L6,L1
2    1.095               L6,L1,L3
3    1.095               L6,L1,L3,L4
4    1.095               L6,L1,L3,L4,L5,L2
5    1.095               L6,L1,L3,L4,L5,L2,L7
6    1.095               L6,L1,L3,L4,L5,L2,L7
7    1.095               L6,L1,L3,L4,L5,L2,L7
8    1.095               L6,L1,L3,L4,L5,L2,L7



Failure of the laminate is assumed to occur when the element
degradation reaches the plate edge. Progressive failure
analysis results of the 20 mm configuration are similar to
Table 7.

b) 35 mm width


Table 8. Progressive failure analysis of configuration 35 mm

No   Failure load (kN)   Layer failing sequence
1    2.03                L4
2    2.03                L4,L1,L2,L5,L6
3    2.03                L4,L1,L2,L5,L6,L3,L7
4    2.03                L4,L1,L2,L5,L6,L3,L7
5    2.03                L4,L1,L2,L5,L6,L3,L7




Failure of the laminate is assumed to occur when the element
degradation reaches the maximum displacement limit.
Results similar to Table 8 are obtained for the 40 mm and
50 mm configurations.

VII. Inferences 

• The stresses are concentrated at bolt-hole regions. 

• The location of maximum stress concentration depends on 
the fibre orientation of each layer. 

• The stress plot around the circumference of hole follows a 
particular trend for a fibre orientation. 

• The stress concentration is larger for carbon epoxy layer 
than glass epoxy layer. 


Table 9 describes the modes of failure of composites. There
are two main modes of failure: fibre failure and matrix
failure.

Table 9. Mechanics of failure of composites

No   Stress component   Mode of failure
1    S11                Fibre failure
2    S22                Matrix failure


Table 10 shows the summary of the progressive failure
analysis. Figure 14 shows the variation of failure load with
width.

Fig. 14. Variation of failure load and width of configuration
(x-axis: w/d ratio)

Table 10. Summary of analysis of Single pin configuration

w/d     Failure     Initial failure   Initial failure layer   Location of failure,           Failure type      Remark
ratio   load (kN)   layer no.         orientation (degree)    circumference angle (degree)
1.5     1.095       6                 90                      95.58                          Tensile failure   S22 stress is tensile
2.0     1.62        6                 90                      95.58                          Tensile failure   S22 stress is tensile
3.5     2.03        4                 45                      52.23                          Bearing failure   S11 stress is compressive
4.0     2.04        4                 45                      52.23                          Bearing failure   S11 stress is compressive
5.0     2.045       4                 45                      52.23                          Bearing failure   S11 stress is compressive


VIII. Conclusion 

The use of composites in load bearing structures is 
primarily motivated by high specific stiffness and high 
specific strength. The stress distribution depends on the 
layup sequence and materials used. First ply failure occurs 
when the first ply or ply group fails in a multidirectional 
laminate. Progressive failure analysis was carried out to 
determine the mode of failure and showed good correlation 
with the stress results. 

• The stresses are concentrated at bolt-hole regions. 

• The stress concentration is larger for carbon epoxy layer 
than glass epoxy layer. 

• Modes of failure considered are fibre failure and matrix 
failure. 

• When w/d ≤ 2: failure type is tensile failure

• When w/d > 2: failure type is bearing failure

• Failure load increases as the width of the plate increases.

Analyzing Online Products Based on Opinion Mining
Algorithm and Semantic Keyword Analysis

C.Suganya 


Abstract — Nowadays online shopping has become a
popular shopping method with the rise of the internet.
Many individuals are looking for new, trendy ways to shop,
and online shopping meets that need. This is why online
stores are a growing business today. Online shopping
includes buying clothes, gadgets, shoes, appliances, or even
everyday groceries. It lets the end user exchange money for
goods without spending much time. Every product on an
online shopping website is associated with reviews that
represent the quality of that specific product, and consumers
typically read these reviews before purchasing. But reading
all the customer reviews before buying a product consumes
a lot of time. To overcome this issue we propose an opinion
mining algorithm and a semantic analysis technique. A major
issue arises when fake reviews are posted by unidentified
users. The system therefore provides a methodology that
allows only customers who have purchased the product from
that website to give a review. Other users are not allowed to
give reviews. This will decrease false reviewing of products,
and customers will get reliable products.

Index Terms — Opinion Mining Algorithm, Fake Reviews, 
Sentimental keyword analysis. 


I. Introduction 

A huge number of product reviews are springing up on the
Internet. From these reviews, customers can obtain first-hand
assessments of product information and guidance for their
purchase actions. Meanwhile, manufacturers can obtain
immediate feedback and opportunities to improve the quality
of their products in a timely fashion. Thus, mining opinions
from online reviews has become an increasingly urgent
activity and has attracted a great deal of attention.
Traditionally, shopping meant that a customer bought a
product from a mall or a shop and paid the supplier at the time
of purchase. For traditional shopping, the customer needs to
be physically present at the shopping place. There was also
no customer review system to describe the quality of a
product. Customers bought products on the basis of the
retailer's opinion or suggestion, and retailers sometimes gave
fake feedback to sell their products. Nowadays, however, the
internet has brought a massive change to shopping. Every
activity is becoming associated with the internet.

With online shopping, customers can buy the products they
need from home using smart phones, laptops, computers,
etc. Payment is made online by means of credit card or net
banking systems. There is no need for the customer to go
physically to a shop or mall to purchase a product and pay
for it. To help customers choose good quality products,
online shopping sites provide reviews of every product,
written by customers who purchased it. Normally all
customers refer to these reviews before buying any product.
But customers need a lot of time to read every review of a
particular product and then decide whether to purchase it. As
some reviews are positive (good) and others negative (bad),
the customer has to examine every review before choosing
that product. We therefore propose a new algorithm to guide
customers in choosing the best products. Here we shortlist
the positive reviews of a particular product by using an
opinion mining algorithm.

C.Suganya, M.Phil Scholar, Dept of Computer Science, Shrimati Indira
Gandhi College, Trichy, Tamilnadu, India.

Another problem arises when false reviews are assigned to
a product. For example, suppose a mobile phone is
available for sale on two different major e-shopping
websites, X and Z. Website Z can give fake negative
(bad) feedback (reviews) to the phone sold on website X, as a
result of which purchasers will reject that phone although it
has good quality specifications. To avoid this problem we
design a mechanism that accepts reviews only from those
customers who have really bought that product; this is
checked against the customer's purchase bill number. This
will minimize fake reviewing of products by competitors.
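The bill-number check described above might look like the following Python sketch. The record layout, identifiers and function name `can_review` are all hypothetical; the text only states that a review is accepted when it can be tied to a purchase bill number.

```python
# Hypothetical purchase records: bill number -> (customer id, product id).
PURCHASES = {
    "B1001": ("cust-7", "phone-x"),
    "B1002": ("cust-9", "laptop-z"),
}


def can_review(bill_number: str, customer_id: str, product_id: str) -> bool:
    """Accept a review only if the bill matches this customer and product."""
    return PURCHASES.get(bill_number) == (customer_id, product_id)


print(can_review("B1001", "cust-7", "phone-x"))
print(can_review("B9999", "cust-7", "phone-x"))
```

Matching on the customer and the product together, rather than on the bill number alone, keeps a valid bill from being reused to review a different product.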

To overcome the above problems we implement the
opinion mining algorithm. Opinion mining is the process of
determining the attitude of the customer towards a
product. In general, the opinion of the user is important for
organizations and individuals seeking to improve the
performance of a service. Opinion mining therefore extracts
(mines) information about particular things from customer
reviews. It is an interesting and important area of research
owing to the growth of web technology. Machine learning is
used to classify the user opinion text. Two techniques are
considered in this work: opinion mining and sentimental
keyword analysis.
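As a simple illustration of keyword-based opinion classification, the Python sketch below scores reviews against two small word lists and shortlists the positive ones. It is a plain lexicon count, not the machine learning classifier mentioned in the text, and the word lists and function names are hypothetical.

```python
import re

# Hypothetical sentiment keyword lists; a real system would use a larger
# lexicon or a trained classifier.
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "broken", "slow"}


def opinion_score(review: str) -> int:
    """Count positive keywords minus negative keywords in a review."""
    words = re.findall(r"[a-z']+", review.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def shortlist_positive(reviews):
    """Keep only the reviews whose keyword score is above zero."""
    return [r for r in reviews if opinion_score(r) > 0]


reviews = ["Great phone, excellent battery", "Terrible screen and slow", "It is a phone"]
print(shortlist_positive(reviews))
```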



II. BACKGROUND AND RELATED WORK 

A. Opinion Mining of Movie Review using Hybrid Method
of Support Vector Machine and Particle Swarm
Optimization:

Online social media is a form of online discourse where
people create content, share it, bookmark it, and
network at an impressive rate. One of the fastest and
easiest-to-use social media platforms today is Twitter. The
messages on Twitter include reviews and opinions on topics
such as movies, books, products, politics, and so on. Based on
this situation, this research attempts to use Twitter messages
to review a movie by using opinion mining or sentiment
keyword analysis. Opinion mining refers to the application of
natural language processing, computational linguistics, and
text mining to classify whether the film is good or not based
on message opinion. Support Vector Machine (SVM) is
a supervised learning method that analyzes data and
recognizes the patterns used for classification. This research
concerns binary classification into two classes, positive and
negative. The positive class indicates a good message
opinion; the negative class indicates a bad message opinion
of a given film. The evaluation is based on the accuracy of
the SVM, validated using 10-fold cross-validation and a
confusion matrix. The hybrid Particle Swarm Optimization
(PSO) is used to improve the selection of the best parameters
in order to solve the dual optimization problem. The result
shows an improvement in accuracy from 71.87% to
77%.
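The accuracy figure mentioned above is derived from a confusion matrix. A minimal Python sketch of that calculation for a binary positive/negative task is given below; the labels and sample predictions are made up for illustration and are unrelated to the reported results.

```python
def confusion_matrix(y_true, y_pred, positive="pos"):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
counts = confusion_matrix(y_true, y_pred)
print(counts, accuracy(*counts))
```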

B. Web product ranking using opinion mining: 

Online shopping is becoming increasingly important as more
manufacturers sell products online and many
users use the Internet to communicate and share their
opinions. However, it is difficult for
consumers to read all product reviews. Therefore, it is
essential to design effective systems that summarize the pros
and cons of product characteristics, so that users can quickly
find the products they prefer. In this project, we present a
product ranking system using an opinion mining algorithm.
Users can specify product features to view the ranking results
of all matched products. In this system, we consider three
factors when computing product scores: 1) product reviews,
2) product popularity, and 3) product release month. Finally,
the experimental results show that the system is practical and
the ranking results are interesting.

C. Opinion Mining Using Frequent Pattern Growth
Method from Unstructured Text:

In the last decade, the area of opinion mining has
experienced a major expansion because of the increase in
online unstructured data contributed by reviewers
on many different topics and subjects. These data
are important for users who want to base their decisions
on the opinions of actual customers of the product. In this
paper, the FP-growth method is used for frequent pattern
mining from review documents, which acts as a backbone for
mining the opinion words along with their applicable
features, using experimental data from two domains that are
very different in nature.

D. Opinion Mining on Social Media Data: 

Micro blogging (social media) has become a very popular
communication tool among online users in recent years.
Information is generated and managed via computers or
mobile devices by a single person and is consumed
by many other persons, with most of this customer-generated
content being textual information. However, analyzing it is
challenging because a micro-blog post is usually very short
and colloquial, and older opinion mining algorithms do not
perform well. Therefore, in this paper, we propose a new
system architecture that can automatically analyze the
sentiments of these messages. We combine this system with
manually annotated data from social media for the task of
sentiment analysis. In this system, machines can learn how to
automatically extract the set of customer messages (data)
that contain opinions, filter out non-opinion messages and
determine their sentiment. Experimental results confirm the
effectiveness of our system for sentiment analysis in real
micro blogging applications.

III. Algorithms 
TEXT MINING ALGORITHM 

Text mining is the analysis of data contained in natural
language text. The application of text mining techniques to
solve business problems is called text analytics. Text mining
can help an organization derive potentially valuable business
insight from text-based content such as Word documents, email
and postings on social media streams like Facebook, Twitter
and LinkedIn. Mining unstructured data through natural
language processing (NLP), statistical modeling and machine
learning techniques can be challenging, because natural
language text is often inconsistent. It contains ambiguity
caused by inconsistent syntax and semantics, including slang,
language specific to vertical industries and age groups,
double entendres and irony.

High-quality information is typically derived by identifying
patterns and trends through means such as statistical pattern
learning. Text mining usually involves structuring the input
text (usually parsing, along with the addition of some derived
linguistic features and the removal of others, and subsequent
insertion into a database), deriving patterns within the
structured data, and finally evaluating and interpreting the
output. 'High quality' in text mining usually refers to some
particular combination of relevance and interestingness.
Typical text mining tasks include text clustering,
concept/entity extraction, production of granular taxonomies,
sentiment keyword analysis, document summarization, and
entity relation modeling.
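
The structure, derive, interpret sequence described above can be sketched in a few lines. This is only an illustration under simple assumptions: tokenization by a regular expression, a tiny hand-made stop-word list, and term frequency as the only derived feature.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOPWORDS = {"the", "is", "a", "and", "but", "of", "it", "this"}


def tokenize(text):
    """Structure the input: lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(text):
    """Derive a feature: term counts after removing stop words."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)


def top_terms(documents, k=3):
    """Interpret the output: the k most frequent terms across all documents."""
    total = Counter()
    for doc in documents:
        total.update(term_frequencies(doc))
    return total.most_common(k)


docs = [
    "The battery is great and the screen is great.",
    "Battery life is poor, but the camera is great.",
]
print(top_terms(docs, 2))
```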

OPINION MINING ALGORITHM 

Opinion mining is a kind of natural language processing for
tracking the mood of the public about a particular product.
Opinion mining, which is also called sentiment analysis,
involves building a system to collect and categorize opinions
about a product. Automated opinion mining frequently uses
machine learning, a type of artificial intelligence (AI), to
mine text for sentiment. Opinion mining can be helpful in
several ways. It can help marketers estimate the success of an
advertising campaign or new product launch, determine which
versions of a product or service are popular and identify
which demographics like or dislike particular product
features. For example, an online review might be broadly
positive about a digital camera, but be particularly negative
about how troublesome it is. Being able to identify this kind
of information in a systematic way gives the vendor a much
clearer picture of public opinion than surveys or focus groups
do, because the data is created by the customer.

There are some challenges in opinion mining. The first is
that a word that is considered positive in one situation may
be considered negative in another. Take the word "long" for
instance. If a purchaser said a laptop's battery life was
long, that would be a positive opinion. If the purchaser said
that the laptop's start-up time was too long, however, that
would be a negative opinion. These differences mean that an
opinion mining system trained to gather opinions on one type
of product or product feature may not perform very well on
another.
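
The "long" example suggests keying polarity on the pair (aspect, opinion word) rather than on the word alone. The sketch below assumes two small hand-written lexicons; all entries and the function name are hypothetical.

```python
# (aspect, opinion word) -> polarity; entries are hypothetical examples.
ASPECT_LEXICON = {
    ("battery life", "long"): +1,
    ("start-up time", "long"): -1,
}

# Fallback polarity for words whose meaning does not depend on the aspect.
DEFAULT_LEXICON = {"good": +1, "bad": -1}


def polarity(aspect, word):
    """Return +1, -1 or 0 (unknown) for an opinion word used about an aspect."""
    key = (aspect.lower(), word.lower())
    if key in ASPECT_LEXICON:
        return ASPECT_LEXICON[key]
    return DEFAULT_LEXICON.get(word.lower(), 0)


print(polarity("battery life", "long"), polarity("start-up time", "long"))
```

A lexicon built for laptops would not carry over to another product category unchanged, which is the transfer problem the paragraph describes.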

Opinion mining is a subtopic of information retrieval on
which considerable research has been done. Several methods
exist to determine an author’s view on a topic from natural
language text. These generally employ some form of machine
learning and have varying degrees of effectiveness.

SENTIMENT ANALYSIS 

Sentiment analysis, also known as opinion mining, refers to
the use of natural language processing (NLP), text analysis
and computational linguistics to identify and extract
subjective information from source materials. Sentiment
analysis is widely applied to reviews, suggestions and social
media for a variety of applications, ranging from marketing to
customer service.

Classification 

There are two major forms of data analysis that can be used
to extract models describing important classes or to predict
future data trends:

• Classification

• Prediction

Classification models predict categorical class labels, and
prediction models predict continuous-valued functions. For
example, we can build a classification model to categorize
bank loan applications as safe or risky, or a prediction model
to predict the expenditures in dollars of prospective
customers on computer equipment given their income and
occupation. The following are examples of cases where the data
analysis task is classification.

Examples 

• A bank loan officer wants to analyze the data in order to
know which loan applicants are risky and which are safe.

• A marketing manager at a company needs to analyze whether a
customer with a given profile will buy a new computer.

In both of the above examples, a model or classifier is
constructed to predict the categorical labels. These labels
are risky or safe for the loan application data and yes or no
for the marketing data.
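
The difference between the two forms of analysis is the type of the output: a label versus a number. The sketch below contrasts a nearest-centroid classifier with a least-squares line on one feature; the income and spending figures are invented, and neither model is claimed to be the one used in the paper.

```python
def fit_line(xs, ys):
    """Prediction: ordinary least squares for y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def fit_centroids(xs, labels):
    """Classification: mean feature value per class label."""
    sums = {}
    for x, label in zip(xs, labels):
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def classify(x, centroids):
    """Assign the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Hypothetical toy data: income in thousands of dollars.
incomes = [20, 30, 40, 50]
spending = [1.0, 1.5, 2.0, 2.5]               # continuous target
risk = ["risky", "risky", "safe", "safe"]     # categorical target

intercept, slope = fit_line(incomes, spending)
centroids = fit_centroids(incomes, risk)
print(intercept + slope * 60, classify(28, centroids))
```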



IV. CONCLUSION 

The aim of this paper is to analyze customer reviews of
mobile phones at the aspect level. The system performs opinion
mining on the given reviews, and the feature-wise summarized
results it generates will be useful to the user in making a
decision. Experimental results indicate that the ‘opinion
mining algorithm’ performs well and has achieved an accuracy
of 93.2%. An opinion mining algorithm is necessary because
nowadays everyone is busy and does not have time to read all
the positive and negative reviews when they just want to know
about some feature of the product. Aspect-level opinion mining
has proved more helpful in these situations than simple
opinion mining.

In future work, efforts will be made to enhance this
technique so that it can identify repeated reviews and
classify those user reviews only once.

Research on Classification and Recognition Algorithm of the High-resolution Remote Sensing Image on Chinese Ancient Villages

Abstract — As an important carrier of Chinese traditional
cultural heritage, the ancient villages are gradually
disappearing. Fortunately, many experts and scholars at home
and abroad are paying more and more attention to the
protection of ancient villages with the help of
high-resolution remote sensing images. Considering that the
surface features in the images are diverse and complex, a new
classification and recognition algorithm for high-resolution
remote sensing images is proposed in this paper. The proposed
algorithm is mainly based on the idea of ensemble learning.
With the algorithm, the image is first processed with
multi-scale and multi-feature segmentation, and then the
spectral and texture features are extracted as the input of
the classification and recognition process. Finally, the
classification and recognition results are decided by an
ensemble classifier constructed from multiple SVM (Support
Vector Machine) base classifiers trained with the AdaBoost
algorithm. The verification experiments indicate that the
proposed algorithm performs clearly better than the
traditional methods.

Index Terms — Ancient village protection, High-resolution 
remote sensing image, Multi-scale and multi-feature 
segmentation, Ensemble learning. 

I. Introduction 

The ancient village, the so-called folk Ecological Museum
[1], is the gene pool of Chinese national culture [2]. However,
its former prosperity has been gradually disappearing due to
the passage of time, poor repair and the impact of the modern
economy. Besides that, the ancient villages are usually
located in remote environments and rugged terrain. In recent
years, as domestic and foreign experts have paid great
attention to the ancient villages in our country, ancient
village protection has become increasingly urgent [3-6].

At present, ancient village conservation is mostly planned
with traditional ways and means, which mainly collect and
analyze the basic data on the current situation from a
perceptual point of view. Unfortunately, their processing
speed and accuracy are both poor, so ancient village
conservation planning cannot make a scientific analysis that
comprehensively considers the interactions among the relevant
data. As ancient village protection is a long-term and dynamic
systems engineering task, it needs dynamic control and
adjustment throughout the whole process. Therefore, it
requires the planning, design and management departments to
promptly grasp the various dynamic data reflecting the current
situation and to use them as evidence for protection and
management by the administrative departments. It is therefore
difficult for the traditional methods to meet the needs of the
developing situation, and it is urgent to explore new
technologies and methods to solve the problems encountered in
the protection planning and management of the ancient
villages.

As many new theories and methods in the domain of pattern
recognition and artificial intelligence have been proposed,
along with the continual exploration of the human vision
mechanism, scholars at home and abroad have made much progress
in the research of high-resolution remote sensing image
classification and recognition, moving from the initial
pixel-based statistical classification gradually towards
intelligent object-oriented automatic classification. For
instance, Thias-Sanz et al. [7] proposed a bridge detection
algorithm for small-format high-resolution panchromatic remote
sensing images based on texture features and a geometrical
model, using a neural network to classify the pixels. Although
effective, it is not suitable for the extraction of
large-format and cross-river bridges. Chini et al. [8]
performed change detection analysis of artificial structures
in high-resolution satellite remote sensing images with
classification methods based on statistics and on neural
networks. They found that the classification accuracy of the
parallel classification method based on the neural network was
higher than that of the statistical method. Melgani and
Bruzzone [9] used the SVM algorithm to classify hyperspectral
remote sensing image data, considering the one-to-one,
one-to-many and other cases. The experimental results showed
that the classification accuracy, stability and robustness of
the SVM method are all better than those of both the RBF
neural network and the K-nearest neighbor classification
method.

In order to make the computer classify and recognize remote
sensing images in a way that is more in line with the human
visual information processing mechanism and ways of thinking,
researchers have explored introducing ideas such as expert
systems, visual models and classifier combination. Mathieu
et al. [10] completed the feature classification of villages
and towns in a region of New Zealand through object-level
analysis of remote sensing images. Yi et al. introduced the
bag-of-words model into remote sensing image analysis as a
guide, then analyzed the semantic relationships between the
visual words according to the PLSA model, and further
completed the classification and identification of the
high-resolution remote sensing image. Mo et al. [11] took
IKONOS high-resolution remote sensing image data as the major
data source, and then automatically extracted the land cover
and land use information in the rural-urban area of Zhuzhou
City, China, through multi-scale segmentation and an
object-oriented image analysis method based on fuzzy logic
classification.

However, the existing target recognition methods for
high-resolution remote sensing images can each only detect one
particular kind of man-made object, such as roads, buildings,
airports, bridges, ports, oil depots or ships. In other words,
there is a lack of methods that can universally detect
different kinds of target features in the images. To better
meet practical application demands and expand the development
prospects of high-resolution remote sensing images, several
research questions are emerging: how to build a highly
efficient intelligent classification and recognition
algorithm, how to comprehensively consider the abundant
feature information in the image, how to convert visual
cognition into computer rules, how to effectively analyze
different kinds of target information in the remote sensing
image, and so on.



Figure 1. The original experimental images: (a) experimental image I; (b) experimental image II; (c) experimental image III.


II. The Preprocessing of the High-resolution Remote 
Sensing Image on Ancient Villages 

As factors beyond control, including satellite disturbance,
atmospheric conditions and sensor effects, inevitably arise
during remote sensing image capture, random errors tend to be
generated, which degrade the image in terms of intensity,
frequency and space. The degradation mainly consists of
contrast decrease, edge blur and geometric distortion, which
affect the analysis and decision-making in the subsequent
application of the images. The information contained in
high-resolution remote sensing images is especially abundant.
If the images are not preprocessed suitably, many problems may
arise, such as a large number of false features, lost real
characteristics and errors in the feature information. To
improve the SNR (signal-to-noise ratio) of the images and
ensure the accurate extraction and identification of the
remote sensing image targets, a restoration operation on the
image is necessary, which is the preprocessing of the remote
sensing image.

In this paper, three high-resolution remote sensing images
are selected as the experimental research materials, as shown
in Figure 1.

Experimental image I: a low-altitude UAV (unmanned aerial
vehicle) image with a resolution of 0.16 m of Taiping Town,
Lushan County, China, captured on April 20, 2013, as shown in
Figure 1 (a).

Experimental image II: a GoogleEarth image of Gong Jia Wan,
Huaihua City, Hunan Province, China, captured in September
2009, with an angle of view of 1.04 km and a size of 1315x679,
as shown in Figure 1 (b).

Experimental image III: a GoogleEarth image of a paddy field
village, Chenxi County, Hunan Province, captured in September
2009, with an angle of view of 774 meters and a size of
1341x687, as shown in Figure 1 (c).


III. The Classification with Ensemble Learning 
Method 

Given the diversity of the ground objects and the complexity
of their spatial distribution in high-resolution remote
sensing images of ancient villages, as well as the limitations
of a single classification algorithm and the complementarity
between different classification methods, a multi-classifier
fusion method based on the idea of ensemble learning can be
introduced to classify and recognize the high-resolution
remote sensing images of the ancient villages. The proposed
method is intended to improve the quality of classification
and recognition and to further optimize the results.

A. Ensemble learning 

Ensemble learning is a machine learning paradigm which first
trains a limited number of learners on the same problem, and
then integrates the outputs of the learners following certain
rules. It conforms to human thinking habits and significantly
improves the generalization ability of the system, and it has
been widely used in many fields, such as speech recognition,
text classification, intrusion detection and image retrieval
[12].

According to the type of training algorithm, ensemble
learning can be divided into homogeneous integration and
heterogeneous integration [13, 14]. Homogeneous ensemble
learning is based on a single learning algorithm, which
generates different base learners according to the
construction strategy. In contrast, heterogeneous ensemble
learning is based on different learning algorithms, using the
differences between them to obtain different base learners.
Because of the differing inherent mechanisms of the learning
algorithms, it is difficult to provide a reasonable and
unified measurement of the integration effect for
heterogeneous ensembles, and the use of different learning
algorithms increases the overall complexity of the ensemble.
Therefore, most current ensemble learning research focuses on
homogeneous ensemble learning.

Multi-classifier ensemble learning is a typical application
of ensemble learning to classification problems, which
improves on the classification performance of a single
classifier by fusing the predictive outputs of several
homogeneous or heterogeneous classifiers. A multi-classifier
ensemble is usually composed of two stages [12]: base
classifier construction (learning stage) and base classifier
combination (application stage). Its basic framework is shown
in Figure 2.
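
The combination stage can be as simple as a vote over the base classifiers' outputs. The following sketch assumes each base classifier returns one class label per sample and optionally carries a weight; the function name and the sample labels are illustrative.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Fuse base-classifier labels for one sample by (weighted) majority vote."""
    if weights is None:
        weights = [1.0] * len(predictions)  # plain majority vote
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Three hypothetical base classifiers label the same image object.
print(weighted_vote(["road", "building", "road"]))
print(weighted_vote(["road", "building", "building"], [2.0, 0.5, 0.5]))
```

In the second call, one highly weighted classifier outvotes two weak ones, which is how a weighted vote differs from a plain majority vote.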



Figure 2 Basic framework of multiple classifier ensembles. 


B. The principle of the algorithm 

SVM (Support Vector Machine) was originally developed for
solving two-class classification problems, and its essence is
to take the easily mistaken training examples as the key to
solving the problem. The main idea is to take the "hard to
distinguish and easily mistaken" samples as the support points
of the classification surface, and then optimize the
discriminant surface to maximize the distance between the
support surfaces of the positive and negative categories. For
limited training sample data in a high-dimensional feature
space, the classifier still has strong generalization ability
even when only a small number of samples are selected as
support vectors. The structure of the classifier is generated
automatically and optimally, which can reduce the test time
and effectively solve the small-sample problem.

As the SVM performs well, its application has gradually been
extended to multi-class classification, which can be achieved
by combining several two-class SVMs under a certain criterion
[13]. But there are still some problems to be solved in the
combination rules; for example, the classification performance
is not as outstanding as on two-class problems. Besides that,
the implementation of multi-class classification is more
complex. However, the multi-classifier ensemble architecture
provides a powerful theoretical approach to improving the
classification and generalization performance of SVM in
multi-class classification problems.

The AdaBoost algorithm is one of the most popular members of
the Boosting family of algorithms, and it makes constructing a
member classifier very simple. It can also obtain very high
precision when making the integrated classification decision
[14]. Considering the limited size of the experimental sample
set in this paper, a multi-classifier fusion classification
and recognition algorithm based on SVM base classifiers and
the AdaBoost construction method is used to recognize and
classify the elements of the high-resolution remote sensing
images of the ancient villages.

The algorithm follows the iterative idea of the AdaBoost
algorithm to train SVM base classifiers with the RBF as the
kernel function. Assume a weight distribution $D_t$ on the
training set $X_{tr}$; in the $t$-th iteration, each training
sample $x_i$ is assigned a weight $D_t(x_i)$. According to the
weight distribution $D_t$, randomly draw a sample set $X_t$
from $X_{tr}$ and take it as the input of the base learning
algorithm (SVM) to train a base classifier $C_t$, then
calculate the classification error $\varepsilon_t$. Use this
error to measure the performance of the base classifier $C_t$
and update the weight distribution of the training samples.
After a certain number of iterations, or when a predetermined
precision is achieved, $T$ base classifiers are obtained. They
are fused by the weighted majority voting rule, and finally a
strong classifier with better decision performance is
obtained.


C. Algorithm description 

Given the training sample set $X_{tr} = \{(x_i, y_i)\}_{i=1}^{N}$
and the iteration number (weak classifier number) $T$. Assign
the initial weights $D_1 = \{w_{1i} = 1/N\}$, $i = 1, 2, \dots, N$,
to each training individual $(x_i, y_i)$ in the training set
$X_{tr}$.

1) According to the weight distribution $D_t$, conduct $N$
rounds of random sampling with replacement from the training
set $X_{tr}$ to obtain a new training set
$X_t = \{(x_i^{(t)}, y_i^{(t)})\}_{i=1}^{N}$;

2) Take $X_t$ as the input of the given base classification
algorithm RBF-SVM and train it to get a base classifier $C_t$;

3) Calculate the classification error $\varepsilon_t$ of the
base classifier $C_t$ on the training sample set:

$$\varepsilon_t = P(C_t(x_i) \neq y_i) = \sum_{i=1,\; C_t(x_i) \neq y_i}^{N} w_{ti} \tag{1}$$

4) If $\varepsilon_t > 0.5$, set $D_{t+1} = \{w_{(t+1)i} = 1/N\}$
and go to step 1; otherwise, set the weight $\alpha_t$ of the
RBF-SVM base classifier:

$$\alpha_t = \frac{1}{2} \ln\left[(1 - \varepsilon_t)/\varepsilon_t\right] \tag{2}$$

5) Update the weight distribution of the training samples:

$$D_{t+1} = \{w_{(t+1)i}\}, \quad w_{(t+1)i} = \frac{w_{ti} \exp(-\alpha_t y_i C_t(x_i))}{Z_t} \tag{3}$$

where $Z_t$ is the normalization factor, which makes $D_{t+1}$
a probability distribution:

$$Z_t = \sum_{j=1}^{N} w_{tj} \exp(-\alpha_t y_j C_t(x_j)) \tag{4}$$

6) If $t = T$ or the specified accuracy is achieved, output
the final strong classifier $C(x)$ according to the majority
voting fusion rule:

$$C(x) = \arg\max\left(\sum_{t=1}^{T} \alpha_t C_t(x)\right) \tag{5}$$
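
Steps 1) to 6) can be traced in a short, dependency-free sketch. Several simplifications are assumptions made here and not taken from the paper: a one-dimensional decision stump stands in for the RBF-SVM base classifier, labels are restricted to -1 and +1 so that the sign of the weighted sum plays the role of the arg max in Eq. (5), a round that is reset in step 4 still counts towards T, and the error is clamped away from zero to keep Eq. (2) finite.

```python
import math
import random


def train_stump(xs, ys):
    """Base learner (stand-in for RBF-SVM): best threshold and polarity on 1-D data."""
    values = sorted(set(xs))
    # Candidate thresholds: one below all points, then midpoints between neighbours.
    candidates = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for thr in candidates:
        for sign in (+1, -1):
            errors = sum(1 for x, y in zip(xs, ys) if (sign if x > thr else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, thr, sign)
    _, thr, sign = best
    return lambda x: sign if x > thr else -sign


def update_weights(weights, alpha, ys, preds):
    """Eqs. (3) and (4): reweight the samples and normalize by Z_t."""
    raw = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
    z = sum(raw)
    return [r / z for r in raw]


def adaboost(xs, ys, T=10, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    weights = [1.0 / n] * n                                    # D_1
    ensemble = []                                              # pairs (alpha_t, C_t)
    for _ in range(T):
        idx = rng.choices(range(n), weights=weights, k=n)      # step 1: resample by D_t
        clf = train_stump([xs[i] for i in idx], [ys[i] for i in idx])  # step 2
        preds = [clf(x) for x in xs]
        eps = sum(w for w, p, y in zip(weights, preds, ys) if p != y)  # Eq. (1)
        if eps > 0.5:                                          # step 4: reset and retry
            weights = [1.0 / n] * n
            continue
        eps = max(eps, 1e-10)                                  # assumption: avoid log of infinity
        alpha = 0.5 * math.log((1 - eps) / eps)                # Eq. (2)
        weights = update_weights(weights, alpha, ys, preds)    # step 5
        ensemble.append((alpha, clf))
    return ensemble


def predict(ensemble, x):
    """Two-class form of Eq. (5): sign of the alpha-weighted vote."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1


xs = [0, 1, 2, 3, 100, 101, 102, 103]
ys = [-1, -1, -1, -1, 1, 1, 1, 1]
ensemble = adaboost(xs, ys, T=10, seed=0)
labels = [predict(ensemble, x) for x in xs]
print(labels)
```

Swapping `train_stump` for an RBF-SVM trainer would bring the sketch closer to the described method; the resampling, error, weight and voting steps would stay the same.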

IV. Experimental Results and Analysis

To analyze the performance of the classification and
recognition algorithm proposed in this paper (the
multi-classifier ensemble classification algorithm), the
nearest neighbor and neural network classification methods
are selected as references in the comparative experiments,
which are performed on the ENVI 5.0 platform.

Algorithm validation and accuracy assessment are fundamental
problems in remote sensing data processing and
classification, and they are important steps when comparing
different classification algorithms. When evaluating the
classification accuracy of remote sensing images, the
confusion matrix is the most commonly used tool. It compares
the classification result with the actual (reference) value.
By comparing the actual classes and the predicted classes of
the surface areas, the relationship between the actual class
and the predicted class can be recorded.

Given $N$ zones and $c$ output classes, the confusion matrix
$M$ is:

$$M = \{m_{ij}\} \tag{6}$$

where $m_{ij}$ represents the total number of zones of real
class $i$ that are recognized as category $j$. The greater the
values of the diagonal elements of the confusion matrix, the
higher the reliability of the classification results.
Similarly, the greater the values of the off-diagonal
elements, the more serious the misclassification. On this
basis, the main indicators of classification accuracy include
the producer's accuracy, the user's accuracy, the overall
accuracy and the Kappa coefficient [7]. In this paper, the
Kappa coefficient, which reflects the overall accuracy more
comprehensively, is selected as the evaluation metric. A
greater Kappa coefficient indicates a higher classification
accuracy of the corresponding classification method.

$$\mathrm{Kappa} = \frac{N \sum_{i=1}^{c} m_{ii} - \sum_{i=1}^{c}\left(\sum_{j=1}^{c} m_{ij} \cdot \sum_{j=1}^{c} m_{ji}\right)}{N^{2} - \sum_{i=1}^{c}\left(\sum_{j=1}^{c} m_{ij} \cdot \sum_{j=1}^{c} m_{ji}\right)} \tag{7}$$
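
As a worked illustration of Eq. (7), the sketch below computes the Kappa coefficient from a confusion matrix given as a list of rows (real class by row, recognized class by column). The 2x2 matrix is invented for the example and is not taken from the experiments.

```python
def kappa(matrix):
    """Kappa coefficient of a square confusion matrix, following Eq. (7)."""
    c = len(matrix)
    n = sum(sum(row) for row in matrix)                        # N: total number of zones
    diagonal = sum(matrix[i][i] for i in range(c))             # correctly classified zones
    row_totals = [sum(row) for row in matrix]                  # zones per real class
    col_totals = [sum(matrix[i][j] for i in range(c)) for j in range(c)]  # per recognized class
    chance = sum(r * k for r, k in zip(row_totals, col_totals))
    return (n * diagonal - chance) / (n * n - chance)


# Hypothetical confusion matrix: 50 zones, 35 on the diagonal.
example = [[20, 5],
           [10, 15]]
print(kappa(example))
```

For this matrix the overall accuracy is 35/50 = 0.70, while Kappa is lower because it discounts the agreement expected by chance from the row and column totals.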

A. The first group of experiments 

Experimental image I, the Taiping Town picture, is adopted
first in the experiments. Three small parts are selected and
processed with the different algorithms. The classification
and recognition results are shown in Figure 3, Figure 4 and
Figure 5, respectively.



Figure 3. The classification and recognition results of the 1st
Taiping town image: (a) Classification and recognition algorithm
proposed in this paper; (b) Neural network classification; (c)
Nearest neighbor classification. Legend: building, road,
farmland, woodlands, river, bridge, other.

For Figure 3, the algorithm parameters of each method
should be set as follows.

(1) The parameters of the proposed algorithm in this paper
should be set as:

T: 30; N: 20; Gamma: 2.5; Penalty Parameter: 600.

(2) The parameters of the neural network classification
algorithm should be set as:

Number of Hidden Layers: 1; Number of Training
Iterations: 600.

(3) The parameters of the nearest neighbor classification
algorithm should be set as:

Neighbors: 6; Threshold: 5.0.
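
The text does not define Gamma or Penalty Parameter. Assuming Gamma is the width parameter of the RBF kernel named in Section III, in the common form K(x, z) = exp(-gamma * ||x - z||^2), and that Penalty Parameter is the soft-margin cost of the SVM, the sketch below shows how Gamma changes the similarity between two feature vectors. The vectors are made up for illustration.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


a, b = [0.0, 0.0], [1.0, 0.0]       # two hypothetical feature vectors at distance 1
for gamma in (0.02, 0.3, 2.5):      # Gamma values that appear in the parameter lists
    print(gamma, rbf_kernel(a, b, gamma))
```

Under this reading, a larger Gamma makes the kernel fall off faster with distance, so each support vector influences a smaller neighbourhood.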

As modern roads are similar to modern buildings in material
texture, misclassification between roads and buildings is
common. Besides that, parts of roads may be misclassified as
bridges because they are connected to each other. In the
experiment, it is found that the nearest neighbor
classification algorithm misclassifies parts of the farmland
as building and modern buildings as road, and this problem is
relatively serious. Some flood land is also classified as road
because of the similarity between flood land and road.
Similarly, with the neural network classification method, the
classification error between farmland and building is
relatively large, and errors also exist among parts of the
road, building and farmland classes. In comparison, the
classification and recognition algorithm proposed in this
paper avoids the above misclassifications to a large extent.
According to the Kappa coefficients shown in Table 1 (0.820
for the proposed algorithm versus 0.623 for the neural network
method and 0.492 for the nearest neighbor method), the
proposed algorithm shows a clear improvement over both
reference methods and thus has better classification
performance.



Figure 4. The classification and recognition results of the 2nd
Taiping town image: (a) Classification and recognition algorithm
proposed in this paper; (b) Neural network classification; (c)
Nearest neighbor classification.

For Figure 4, the algorithm parameters of each method
should be set as follows.

(1) The parameters of the proposed algorithm in this paper
should be set as:

T: 30; N: 20; Gamma: 0.3; Penalty Parameter: 500.

(2) The parameters of the neural network classification
algorithm should be set as:

Number of Hidden Layers: 1; Number of Training
Iterations: 600.

(3) The parameters of the nearest neighbor classification
algorithm should be set as:

Neighbors: 3; Threshold: 3.5.

The experimental results indicate that the classification and
recognition algorithm proposed in this paper effectively
avoids the following errors: the nearest neighbor
classification method misclassifies the river as woodland, and
the neural network method misclassifies the jungle as
building. As shown in Table 1, the Kappa value of the proposed
method (0.875) is clearly higher than those of the reference
methods (0.774 and 0.476). The classification accuracy is thus
improved, and the overall recognition results are close to the
actual distribution.



Figure 5. The classification and recognition results of the 3rd
Taiping town image: (a) Classification and recognition algorithm
proposed in this paper; (b) Neural network classification; (c)
Nearest neighbor classification.

For Figure 5, the algorithm parameters of each method
should be set as follows.

(1) The parameters of the proposed algorithm in this paper
should be set as:

T: 30; N: 20; Gamma: 0.02; Penalty Parameter: 500.

(2) The parameters of the neural network classification
algorithm should be set as:

Number of Hidden Layers: 1; Number of Training
Iterations: 600.

(3) The parameters of the nearest neighbor classification
algorithm should be set as:

Neighbors: 3; Threshold: 2.7.

In Figure 5, the housing is clustered, while farmland and
forest land occupy a relatively large proportion of the
image. With the traditional nearest neighbor classification
method, many mistakes appear. When the neural network is used,
the classification precision is improved to some extent, but
farmland and woodland areas are still misrecognized as
building. With the classification and recognition algorithm
proposed in this paper, the classification result is
relatively close to the actual distribution.


Table 1. The Kappa coefficient of the Taiping Town ROI images

| Classification recognition method | Figure 3 | Figure 4 | Figure 5 |
|---|---|---|---|
| Nearest neighbor classification | 0.492 | 0.476 | 0.406 |
| Neural network classification | 0.623 | 0.774 | 0.720 |
| Classification and recognition algorithm proposed in this paper | 0.820 | 0.875 | 0.872 |


B. The second group of experiments 

For Figure 6, the algorithm parameters of each method
should be set as follows.

(1) The parameters of the proposed algorithm in this paper
should be set as:

T: 30; N: 25; Gamma: 0.03; Penalty Parameter: 600.

(2) The parameters of the neural network classification
algorithm should be set as:

Number of Hidden Layers: 1; Number of Training
Iterations: 500.

(3) The parameters of the nearest neighbor classification
algorithm should be set as:

Neighbors: 3; Threshold: 1.2.

In this group of experiments, the nearest neighbor
classification method seriously misclassifies farmland as
building. Although the neural network classification method
improves on this to a certain extent, its classification
accuracy is still low compared with the classification and
recognition method in this paper. The Kappa coefficients in
Table 2 show that the proposed algorithm improves on the
nearest neighbor classification method and the neural network
classification method by 49.1% and 21%, respectively (Kappa
differences of 0.491 and 0.209).
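
The quoted improvements can be checked against the Kappa values reported in Table 2 (0.413, 0.695 and 0.904). The short sketch below computes both the absolute difference in Kappa and the gain relative to the baseline; the function names are illustrative. The reported 49.1% and 21% match the absolute differences, so they read as percentage points of Kappa and not as relative gains.

```python
# Kappa coefficients as reported in Table 2.
KAPPA = {"nearest_neighbor": 0.413, "neural_network": 0.695, "proposed": 0.904}


def absolute_gain(baseline, proposed):
    """Difference in Kappa, in the same units as Kappa itself."""
    return proposed - baseline


def relative_gain(baseline, proposed):
    """Gain expressed as a fraction of the baseline Kappa."""
    return (proposed - baseline) / baseline


for name in ("nearest_neighbor", "neural_network"):
    base = KAPPA[name]
    print(name,
          round(absolute_gain(base, KAPPA["proposed"]), 3),
          round(relative_gain(base, KAPPA["proposed"]), 3))
```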


Table 2. The Kappa coefficient of the Gong Jia Wan experimental image

| | Nearest neighbor classification | Neural network classification | Classification and recognition algorithm proposed in this paper |
|---|---|---|---|
| Figure 6 | 0.413 | 0.695 | 0.904 |


From Figure 6, the nearest neighbor classification method
seriously misclassifies farmland as building. Although the
neural network classification method improves on the former
to a certain extent, its classification accuracy is still
very low compared with the classification and recognition
method proposed in this paper. According to the Kappa
coefficients in Table 2, the algorithm proposed in this paper
performs significantly better than the other two.



Figure 6. The classification and recognition results of the Gong Jia 
Wan experimental image: (a) Classification and recognition 
algorithm proposed in this paper; (b) Neural network classification; 
(c) Nearest neighbor classification. 

C. The third group of experiments 

For Figure 7, the algorithm parameters of each method are 
set as follows. 

(1) The parameters of the algorithm proposed in this paper 
are set as: 

T: 30; N: 25; Gamma: 0.04; Penalty Parameter: 700. 

(2) The parameters of the neural network classification 
algorithm are set as: 

Number of Hidden Layers: 1; Number of Training 
Iterations: 500. 

(3) The parameters of the nearest neighbor classification 
algorithm are set as: 

Neighbors: 3; Threshold: 1.7. 


Figure 7. The classification and recognition results of the large 
paddy fields experimental image: (a) Classification and recognition 
algorithm proposed in this paper; (b) Neural network classification; 
(c) Nearest neighbor classification. 

In this set of experiments, the results of all three 
classification methods are fairly close to the actual 
distribution of the ground features. However, the nearest 
neighbor method misclassifies some farmland as buildings and 
recognizes a large number of buildings as farmland. The neural 
network classification method also identifies a noticeable 
number of buildings as farmland. 

The houses are clustered, the country roads are narrow and 
irregularly distributed, and, because of the shooting angle, 
some roads are occluded by buildings and dense vegetation. 
Table 3 shows that the Kappa coefficient of the algorithm 
proposed in this paper is 0.379 (37.9 percentage points) higher 
than that of the nearest neighbor classification method, and 
that the proposed algorithm clearly outperforms both of the 
other methods. 


Table 3. The Kappa coefficient of the large paddy field experimental image 

Image | Nearest neighbor classification | Neural network classification | Classification and recognition algorithm proposed in this paper
Figure 7 | 0.512 | 0.780 | 0.891


V. Conclusion 



In this paper, classification and recognition algorithms for 
high-resolution remote sensing images of Chinese ancient 
villages are analyzed. Because the surface features in remote 
sensing images are so diverse and complex, traditional 
algorithms such as the neural network and nearest neighbor 
algorithms can hardly detect all kinds of target objects. To 
better meet these practical application demands, a new method 
based on ensemble learning and multi-classifier fusion from the 
field of pattern recognition is proposed in this paper. The 
analysis follows the procedure "remote sensing image 
preprocessing - image segmentation - classification and 
recognition". Finally, in a series of comparative 
classification and recognition experiments on three original 
images, the proposed algorithm based on multi-classifier 
ensemble learning performs clearly better than the neural 
network and nearest neighbor classification methods, according 
to the Kappa coefficients shown in Tables 1, 2 and 3. 
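The conclusion attributes the gain to multi-classifier fusion, but this section does not spell out the fusion rule. As a minimal sketch of the general idea, the example below fuses per-segment labels from several classifiers by majority vote. Majority voting is an assumption chosen for illustration and is not necessarily the rule used in the paper; the labels and classifier outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse label lists from several classifiers by majority vote.

    `predictions` is a list with one label list per classifier; all lists
    must have the same length. Ties go to the label seen first, that is,
    the label from the earliest classifier in the list.
    """
    fused = []
    for labels in zip(*predictions):
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


# Hypothetical outputs of three base classifiers for three image segments.
classifier_a = ["farmland", "building", "road"]
classifier_b = ["farmland", "farmland", "road"]
classifier_c = ["building", "building", "farmland"]

fused = majority_vote([classifier_a, classifier_b, classifier_c])
print(fused)
```

Each fused label is the one chosen by at least two of the three base classifiers, so an error made by a single classifier is outvoted.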
Pakistani Punjabi Men’s Summer “Shalwar kameez” 
of upper class having income of US$ 5000/Month. 
Best practices and Spinning norms 

Mr. Allah dad 


Abstract — Cotton fabrics have important properties such as 
moisture absorbency and a soft handle and feel. Because of these 
properties and the long summer season, cotton is the dominant 
fibre preference for apparel among the people of Pakistan. The 
textile sector plays a vital role in the economic growth of the 
country. In Pakistan, Punjab is the most populated province, and 
most people engaged in business or agriculture prefer to wear 
“Shalwar kameez” in summer. In this research paper, premium 
quality woven cotton fabric was analyzed in terms of its 
characteristics and properties for a high-income group, with 
incomes of around US$ 5000/month, located in various parts of 
the province. By studying the spinning norms and the 
characteristics of the fabric through various laboratory tests, 
best practices were developed for the fabric. The norms relate 
to the demands placed on the fabric by the customer, such as 
strength, wearing properties, handle and feel, with respect to 
the right fibre type, yarn type and fabric construction. 

Index Terms — Cotton Fabric, Shalwar Kameez, feel, fibre. 


I. INTRODUCTION 

The economy of Pakistan is directly related to the textile 
sector of the country, especially cotton-containing products. 
Textile goods account for the major portion of value-added 
products and exports, around 55% of total export value. In 
response to local demand for cotton products, many textile 
producers have started working with premium-quality cotton 
fibre (such as Extra Long Staple cotton from the US) (Emeka 
Osakwe, May 18 2009). For premium-quality textile products, 
fibre length is considered the foremost characteristic 
(Richard, 2012) and is one of the most important characteristics 
for premium-quality yarn production (Cui et al., 2009). Fibre 
length, known as staple length, can be measured with the 
fibrograph proposed by Hertel in 1940, which is considered the 
most reliable source of fibre length measurement (Hertel, 
1940). Staple length can also be measured easily with high-tech 
instruments such as the USTER HVI and USTER AFIS PRO. Length is 
most often characterized by the mean length (ML), the short 
fibre content (SFC), the upper quartile length (UQL), and so on. 
These parameters play an important role in the manufacture of 
cotton-based staple yarn: they directly affect the quality of 
the yarn and, ultimately, the quality of the fabric (Lin, Xing, 
Oxenham, & Yu, 2012). In Pakistan, the cotton grown is of short 
staple length and medium staple length, such as 1/8 of inch. 
Various cotton varieties are planted and available in Pakistan 
and used in textile goods, such as CIM 496 in Punjab and 
NIAB-78 and CRIS-134 in Sindh. The staple is short to medium 
(although mostly medium). The short and medium fibre length is 
attributed to the prevalence of insects and mealy bugs, along 
with climatic conditions (Emeka Osakwe, May 18 2009). 
Premium-quality woven fabrics are characterized by important 
properties such as light weight, soft feel, cover factor, high 
drape, a well-designed look and eco-friendliness (Swamy 2002). 
Very limited research has been carried out on the finest yarns 
and fabrics for a particular culture and region, and there is 
always a need to address important aspects of yarn and fabric 
manufacturing, such as durability in terms of tensile strength, 
bursting strength and abrasion resistance (Uttam and Gangwar 
2006). The durability and strength of a fabric depend not only 
on the strength of the yarn used but also on many other factors 
important to the manufacture of premium-quality woven fabrics 
(Morton 1949; Realff et al. 1997). A correlation exists between 
yarn strength and fabric structure (Essam 1929). The density of 
warp and weft yarns influences the flexural rigidity and 
modulus of the fabric (Cooper 1965; Gere 2003; Guthrie et al. 
1954; Lord and Mohamed 1982; Montgomery 2005; Nash 1972; Peirce 
1930; Tuma 1993; Yuksekkaya et al. 2008). As the linear density 
of the yarn increases, these two parameters also increase. 
Cotton fabrics of various constructions, such as plain, 1/3 
twill and 4-end irregular sateen, have been made, and the effect 
of fabric weave on fabric properties such as mechanical 
properties, bending and creasing behaviour, and appearance has 
been studied before (Ashis Kumar Samanta, Asis Mukhopadhyay, 
Madhusudan M. Bhagwat & Tapas Ranjan Kar 2015). Ureyen and 
Kadoglu (2006) used a linear multiple regression method to 
estimate the qualitative characteristics of yarn. They found 
that, in addition to fibre properties, yarn count, twist and 
roving properties had considerable effects on the yarn 
properties. Strumillo, Cyniak, Czekalski, and Jackowski (2007) 
determined the functional dependencies of selected fundamental 
parameters of cotton yarn quality, such as tenacity, elongation, 
unevenness, hairiness and the number of faults, on the linear 
density of the yarn. With an increase in linear density (tex), 
tenacity, elongation and hairiness increase, and the number of 
faults decreases (El-Mogahzy 2006). 
The design of any fabric or garment is defined by many 
properties and behaviours of the product. The design problem is 
solved by meeting both aesthetic and physical requirements 
(Güngör Başer 2008). In this research paper the aesthetic 
demands concern colour and feel, while the physical demands 
cover, for the fibre, staple length, strength and coefficient of 
variation; for the yarn, tensile strength, thick/thin places and 
evenness; and for the fabric, the design parameters weave type, 
strength, density, and warp and weft fineness. The design of a 
woven fabric is set by assigning values to a set of parameters, 
each denoting a property of the product. The values can 
describe the colour, shape or behaviour of the product. 

II. Methodology 

Pakistan contains diverse cultures that vary by region. Punjab 
is Pakistan's second largest province, at 79,284 sq miles, and 
is the most developed, most populous and most prosperous 
province of Pakistan. In general, shalwar kameez, paghri, lacha 
and dhoti are the common costumes of Punjabi people (Sarah 
Veach, Katy Williamson, Texas State University). The province 
contains two major regions, central Punjab and southern Punjab. 
The major source of income is agriculture and its allied 
industries. To determine the preferences for men's apparel in 
Punjab, a small structured questionnaire containing closed-ended 
questions was administered, designed around the specific 
research objective and research concern. The questions cover 
important observations directly related to the research 
questions. The questionnaire asked the following: 

1) Your occupation 

2) Your monthly income 

3) Please indicate which items of clothing you prefer to wear 
in the summer during your business hours. 

4) Which part of the province do you belong to? 

5) What qualities do you look for in your clothing? 

6) Which colors do you prefer to wear during the summer? 

7) Which fabric type do you prefer to wear in the summer? 

In conducting the survey, the questionnaire was developed using 
a mixed-methods approach. Sampling is a typical step in 
mixed-methods research that combines quantitative and 
qualitative approaches (Curtis, Gesler, Smith, and Washburn, 
2000; Onwuegbuzie and Leech, 2005c, 2007). Quantitative 
researchers tend to make “statistical” generalizations, which 
involve generalizing findings and inferences from a 
representative statistical sample to the population from which 
the sample was drawn. In contrast, many qualitative researchers, 
although not all, tend to make “analytic” generalizations 
(Miles & Huberman, 1994), which are “applied to wider theory on 
the basis of how selected cases ‘fit’ with general constructs” 
(Curtis et al., 2000, p. 1002); or they make generalizations 
that involve case-to-case transfer (Firestone, 1993; Kennedy, 
1979). In other words, statistical generalizability refers to 
representativeness (i.e., some form of universal 
generalizability), whereas analytic generalizability and 
case-to-case transfer relate to conceptual power (Miles & 
Huberman, 1994). Therefore, the process of sampling is important 
to both quantitative and qualitative research. Unfortunately, a 
false dichotomy appears to prevail with respect to the sampling 
schemes available to quantitative and qualitative researchers. 
As noted by Onwuegbuzie and Leech (2005b), random sampling tends 
to be associated with quantitative research, whereas non-random 
sampling typically is linked to qualitative research. However, 
the choice of sampling class (i.e., random vs. non-random) 
should be based on the type of generalization of interest (i.e., 
statistical vs. analytic). Two sampling schemes were selected: 
simple random sampling, in which every individual in the 
sampling frame (i.e., desired population) has an equal and 
independent chance of being chosen for the study, and 
homogeneous sampling, in which settings, groups, and/or 
individuals are chosen on the basis of similar or specific 
characteristics. Cases were selected by random purposeful 
sampling, that is, by selecting random cases from the sampling 
frame and randomly choosing a desired number of individuals to 
participate in the study. The choice of sample size is as 
important as the choice of sampling scheme because it also 
determines the extent to which the researcher can make 
statistical and/or analytic generalizations. The sample size 
for analyzing the preferences of the respondents was 30 (e.g., 
Charles & Mertler, 2002; Creswell, 2002; Gall, Borg, & Gall, 
1996; Gay & Airasian, 2003; McMillan & Schumacher, 2001). 
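The simple random scheme described above, in which every individual in the sampling frame has an equal and independent chance of selection, can be sketched with Python's standard library. The sampling frame of 200 respondent identifiers and the fixed seed are hypothetical; only the sample size of 30 comes from the text.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals from the sampling frame.

    Sampling is without replacement, and every individual has the same
    chance of being selected.
    """
    rng = random.Random(seed)  # seeded generator so the draw is repeatable
    return rng.sample(frame, n)


# Hypothetical sampling frame of 200 eligible respondents.
frame = [f"respondent_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(frame, 30, seed=42)
print(len(sample), "respondents selected")
```

Fixing the seed makes the draw reproducible, which is useful when the selection has to be documented alongside the survey results.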

The results of the questionnaire, from a sample of 30 
respondents, provide evidence that Shalwar Kameez is the 
preferred apparel of the Punjabi people, along with their 
desired characteristics. The result is given in table 1: 

Scale: 

0 - 0.9 = 05 Respondents 

1 - 1.9 = 05 Respondents 

2 - 2.9 = 05 Respondents 

3 - 3.9 = 05 Respondents 

4 - 4.9 = 05 Respondents 

5 - 5.9 = 05 Respondents 

6 - 6.9 = 05 Respondents 

The statistical data collected from the survey are shown in the 
graphical representation above. According to the results, the 
respondents were equally distributed between the central and 
southern regions of Punjab province. The 30 respondents, who 
belong to the upper class with incomes of US$ 4000 to 
5000/month, favoured Shalwar Kameez as men's apparel in the 
summer season. Cotton fabrics were selected because of their 
moisture absorbency, light weight, natural fibre, smooth feel 
and comfortable wearing properties (A.J. Turner, 2009, Natural 
and Man-made Fiber). Cotton has been used to produce yarns and 
fabrics from time immemorial. With the advent of new 
technologies and increased knowledge of cotton fibres, it has 
become possible to control the properties of yarns and fabrics 
through proper selection of cottons and machinery parameters 
(Arindam Basu, South India Textile Research Association, 
Coimbatore, India). A large number of scientists have worked on 
the predictability of yarn properties based on fibre 
characteristics such as length, strength, fineness, inter-fibre 
friction, etc. Hunter (2004) reviewed 200 articles related to 
this subject. Cheng and Adams (1995), Guha, Chattopadhyay, and 
Jayadeva (2001) and Jayadeva, Gupta, and Chattopadhyay (2003) 
attempted to use the latest tools, such as artificial neural 
networks, to predict yarn quality on the basis of cotton-fibre 
quality. All of these works reported good correlations between 
fibre properties and yarn properties. Kumar, Nishkam, and 
Ishtiaque (2005) studied the effect of inter-fibre friction on 
yarn quality. All of these studies were based on one cotton at 
a time, i.e. yarn was produced from a single cotton, and 
relationships were derived. 


III. Sampling and Test. 

To determine the characteristics of the woven cotton fabric 
(Shalwar Kameez) and the spinning norms, starting from the raw 
material, three types of cotton fibre were selected with 
respect to origin and staple length, together with yarn and 
fabric samples: 

a) Medium staple length (Pakistani cotton variety of staple 
length 1/8 inch, blended with long staple Egyptian cotton) 

b) Long staple length (American cotton of staple length 0.9 to 
1.25 inch) 

c) Extra long staple (Egyptian cotton of staple length 1 to 2.2 
inch) 

d) Compact combed and carded yarn 

e) Cotton fabric swatches 

The tests were performed under controlled laboratory 
conditions following the standards of the American Society for 
Testing and Materials (ASTM). 

The ASTM standards used are: 

• Preconditioning of specimens for moisture equilibrium 
ASTM D1776 

For cotton fibre classification and testing, the conditioning 
time is 4 hr at a temperature of 21 ± 1 °C [70 ± 2 °F] and a 
relative humidity of 65 ± 5 % (Standard Practice for 
Conditioning and Testing Textiles, Designation: D1776/D1776M - 
16). 

• Tear Strength 
ASTM D1424 

A slit is centrally precut in a test specimen held between two 
clamps, and the specimen is torn through a fixed distance. The 
resistance to tearing is in part factored into the scale reading 
of the instrument and is computed from this reading and the 
pendulum capacity. Precondition the specimens by bringing them 
to approximate moisture equilibrium in the standard atmosphere 
for preconditioning textiles as directed in Practice D1776. 
From each sampling unit, take five specimens from the machine 
direction and five specimens from the cross-machine direction. 
Consider the long direction of the specimen as the direction of 
test. (ASTM Standard Test Method for Tearing Strength of 
Fabrics by Falling-Pendulum (Elmendorf-Type) Apparatus, 
Designation: D1424 - 09) 

• Tensile Strength 
ASTM D5034 

This test method describes procedures for carrying out fabric 
grab tensile tests using two types of specimens and three 
alternative types of testing machines. The tensile testing 
machine is of the CRE, CRL, or CRT type, conforming to the 
specification with respect to force indication, working range, 
capacity and elongation indicator, and designed for operation 
at a speed of 300 ± 10 mm/min (12 ± 0.5 in./min), or with a 
variable speed drive, change gears, or interchangeable weights 
as required to obtain a 20 ± 3 s time-to-break. (ASTM Standard 
Test Method for Breaking Strength and Elongation of Textile 
Fabrics (Grab Test), Designation: D5034 - 09 (Reapproved 2013)) 

• Seam Strength 
ASTM D1683 

This test method measures the sewn seam strength in woven 
fabrics by applying a force perpendicular to the sewn seam. 
Cut five test specimens 350 ± 3 mm [14 ± 0.1 in.] by 100 ± 3 mm 
[4 ± 0.1 in.] with their long dimensions parallel either to the 
warp (machine) direction or to the filling (cross) direction, or 
cut specimens for testing from both directions if required. 
Fold the specimen 100 ± 3 mm [4 ± 0.1 in.] from one end with the 
fold parallel to the short direction of the fabric. After 
seaming, cut the fold open. The test specimen should contain a 
seam approximately 100 ± 3 mm [4 ± 0.1 in.] from one end. Each 
test specimen will contain sufficient material for one seamed 
and one fabric test. (Standard Test Method for Failure in Sewn 
Seams of Woven Apparel Fabrics, Designation: D1683/D1683M - 
11a) 

• Color fastness to washing 
AATCC-61 2A 

• Breaking strength and elongation 
ASTM D5034 

Cut each specimen 100 ± 1 mm (4 ± 0.05 in.) wide by at least 
150 mm (6 in.) long, with the long dimension parallel to the 
direction of testing and force application. (ASTM Standard Test 
Method for Breaking Strength and Elongation of Textile Fabrics 
(Grab Test), Designation: D5034 - 09) 


IV. Results 


A. Norms of Fiber and yarn 

Yarn fineness | Cotton variety | Staple length (mm) | Uniformity ratio % (standard at 5%) | Fineness as Micronaire | Tenacity (gm/tex) | CV %
60/s Ne | Egyptian cotton 100% | 25.6 | 10.44 | 2.9 | 26.3 | 12.3
56/s Ne | Pakistani cotton with Egyptian cotton, 50:50 ratio | 18.4 | 12.22 | 4 | 20 | 15.9
B. Spinning Norms with respect to count Lea Strength 
Product 

Lea dimensions: 120 yard. Nm = metric system. 

Yarn fineness | Normal (Nm * kg) | Premium cotton having staple length (Nm * kg) | CV% | U% (unevenness)
60/s Ne | 1522 | | 12 | 22
56/s Ne | 1300 | | 14 | 19


C. Norms of yarn with respect to twist 

Yarn fineness | Minimum twist per inch | Maximum twist per inch
60/s Ne | 26 | 38
56/s Ne | 22 | 32
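The twist norms above can be related to yarn count through two standard cotton-system relations: linear density in tex is 590.5 divided by the English count Ne, and the English twist multiplier is twists per inch divided by the square root of Ne. The sketch below applies both to the values in the twist table. The twist multiplier is not reported in the paper, so it is shown only as a derived illustration, and the function names are illustrative.

```python
import math


def ne_to_tex(ne):
    """Convert English cotton count (Ne) to linear density in tex."""
    return 590.5 / ne


def twist_multiplier(tpi, ne):
    """English twist multiplier: twists per inch / sqrt(Ne)."""
    return tpi / math.sqrt(ne)


# Minimum and maximum twist per inch from the twist norms table.
twist_norms = {60: (26, 38), 56: (22, 32)}

for ne, (tpi_min, tpi_max) in twist_norms.items():
    print(
        f"{ne}/s Ne = {ne_to_tex(ne):.2f} tex, "
        f"twist multiplier {twist_multiplier(tpi_min, ne):.2f} "
        f"to {twist_multiplier(tpi_max, ne):.2f}"
    )
```

Expressing twist as a multiplier makes the two counts comparable, since the twist per inch needed for a given yarn character rises with the square root of the count.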


D. Spinning Norms of the Cotton yarn (thick and thin places 
and Grade of yarn) 

Yarn fineness | Thick places (+50%) per km length | Thin places (-50%) per km length | Neps (+200%) per km length | Grade of yarn
60/s Ne | 22 | 4 | 44 | A
56/s Ne | 100 | 53 | 70 | A-


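E. Fabric quality parameters and their norms 

Parameter | 60*60/124*104 | 56*56/110*100
Breaking strength (lbf), ASTM D-1424, warp | 2.5 | 2
Breaking strength (lbf), ASTM D-1424, weft | 1.89 | 0.98
Elongation %, warp | 12 | 8
Elongation %, weft | 14 | 8
Tensile strength (lbf), ASTM D-5034, warp | 86 | 82
Tensile strength (lbf), ASTM D-5034, weft | 58 | 51
Seam strength (lbf), ASTM D-1683, warp | 51 | 45
Seam strength (lbf), ASTM D-1683, weft | 45 | 42
Color fastness to washing, AATCC-61 2A, shade change | 4 | 4
Color fastness to washing, AATCC-61 2A, staining on cotton | 4 | 5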


CONCLUSION 

The quality of the fabrics was analysed on the basis of all the 
important spinning norms. Analysis of the results obtained from 
the two different fibre and fabric constructions shows that the 
fibre and yarn quality of the fabric construction 
60*60/124*104, with all its technical parameters, can be 
considered best practice for shalwar kameez for the people of 
Punjab province, Pakistan. The results show that Egyptian 
cotton with long staple length and a fineness ranging from 60 
to 64 in single ply is suitable for this fabric quality. 
Abstract — This research aims to study the effect of insert 
materials on the strength of diffusion-welded joints between 
commercial-grade aluminum (AL200) and carbon steel (S20C). 

In this experiment, in addition to clarifying the mutual 
relation between the insert materials and the welding 
conditions, the relation between the strength of a joint and the 
formation of a compound layer was also investigated. Alongside 
the study of the metallurgical effects of the insert materials 
by accurate microstructure examination, it was found that insert 
materials, when used in thin-layer form, play an important role 
in enhancing diffusion when welding materials that differ in 
melting point. The results also showed that the deformation rate 
of the joints decreases when insert materials with low yield 
stress are used at constant welding conditions. 

Index Terms — welding, diffusion, joint, temperature 

I. Introduction 

Diffusion welding is one of the methods used to join two 
different metals permanently. The method has many advantages: 
it leaves no mark on the two welded pieces or on the joint 
region [1], and the bonding force between the two pieces is very 
large because it results from the diffusion of atoms as the 
temperature is raised. It has therefore entered many areas of 
manufacturing, including the production of precision 
instruments that require great sensitivity, such as the 
electrical transistor industry and small parts of electronic 
computers [2]. 

Several experiments have been conducted in this area on the 
welding of dissimilar metals. The best known were carried out 
by the Japanese Kaukato group, who performed diffusion welding 
between pure aluminum AL100 and mild steel using intermediate 
materials such as Ti-Ni alloy. These studies identified 
mechanical properties such as tensile strength and impact 
strength under different welding conditions. They showed that 
the effectiveness of intermediate materials depends on the 
quality of the heat treatment performed on the two parent 
pieces and on the nature of the welding, that is, whether it was 
carried out in one step or more. The results also showed that 
the welding conditions (temperature, time, pressure) play an 
active role and have a large effect on the mechanical strength 
of the welded joint [3]. 

This research studies the possibility of obtaining the maximum 
tensile strength of the welded joint through the use of the 
intermediate materials Ni, 2024 alloy and Ag. It investigates 
the correlation between the effect of the intermediate 
materials and the conditions of the welding process, and sheds 
light on the relationship between the strength of the joint and 
the intermetallic compounds formed during diffusion. 


II. THEORETICAL BASIS 

Diffusion in metals is important in practice. It occurs as a 
result of the relative movement of atoms: an atom moves from one 
place to another within the crystal lattice of the metal and 
oscillates about its equilibrium position, and this change of 
atomic sites is the cause of diffusion in the material [4]. 
Diffusion does not take place inside the grains only, but also 
along grain boundaries and on free surfaces. Laboratory 
experiments have shown that diffusion along grain boundaries is 
faster than inside the grains, and that diffusion on free 
surfaces is faster than both. This is attributed to the less 
tightly packed structure of grain boundaries and free surfaces 
[2]. Diffusion via free surfaces and grain boundaries is 
important because grain boundaries occupy considerable space 
and form a network covering the metal sample. The diffusion 
coefficient depends on composition and temperature, as the 
following equation describes [5]: 

$$D = D_0 \exp\left(-\frac{Q}{RT}\right) \qquad (1)$$

where: 

D: diffusion coefficient 

D_0: frequency factor 

Q: activation energy for diffusion 

R: gas constant 

T: temperature 
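Equation (1) can be evaluated numerically to see how strongly the diffusion coefficient depends on temperature. In the sketch below the frequency factor and activation energy are hypothetical placeholder values, not data from this study. Only the two temperatures (500°C and 600°C) are taken from the welding range used later in the paper.

```python
import math

R = 8.314  # universal gas constant, J/(mol K)


def diffusion_coefficient(d0, q, temperature_k):
    """Arrhenius form of equation (1): D = D0 * exp(-Q / (R * T)).

    d0 in m^2/s, q in J/mol, temperature_k in kelvin.
    """
    return d0 * math.exp(-q / (R * temperature_k))


# Hypothetical parameters for illustration only (not measured values).
D0 = 1.0e-4  # frequency factor, m^2/s
Q = 140.0e3  # activation energy, J/mol

d_500 = diffusion_coefficient(D0, Q, 500 + 273.15)
d_600 = diffusion_coefficient(D0, Q, 600 + 273.15)
ratio = d_600 / d_500
print(f"D(600 C) / D(500 C) = {ratio:.1f}")
```

Because the temperature sits inside the exponential, a rise of 100°C changes D by about an order of magnitude for these assumed parameters, which is why the welding temperature has such a strong effect on joint formation.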

Practical experience has shown that different metals do not 
diffuse at the same rate; the element with the lower melting 
point diffuses faster. For example, in alpha brass (a mixture of 
copper and zinc) the zinc atoms diffuse faster than the copper 
atoms, whereas in a diffusion couple composed of copper and 
nickel the copper atoms diffuse faster [6]. As a result, 
expansion and contraction occur at the interface. Expansion in 
the direction perpendicular to the interface is not restrained, 
but contraction and expansion in the direction parallel to the 
interface are restrained by the part of the diffusion couple in 
which no diffusion occurs. One part is therefore under tensile 
stress and the other under compressive stress where atoms are 
displaced, and these stresses lead to plastic deformation [7], 
accompanied by the formation of sub-grains, recrystallization 
and grain growth. 

There are several diffusion mechanisms: 

A. Interstitial diffusion: 

In this mechanism an atom moves from one interstitial site to 
the nearest neighbouring interstitial site without permanently 
displacing the original (matrix) atoms. The jump is accompanied 
by a distortion of the crystal, and this distortion acts as a 
barrier to diffusion. This kind of diffusion is common in alloys 
in which the solute atom occupies interstitial sites; the 
distortion is small and the diffusion does not require 
vacancies [2]. 

The interstitial diffusion coefficient is expressed as follows: 

$$D = \alpha a^2 Z \nu \exp\left(-\frac{\Delta F}{RT}\right) \qquad (2)$$

where: 

α: geometric factor 

a: lattice constant 

Z: number of neighbouring sites 

ν: frequency 

ΔF: energy needed for diffusion 
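Equation (2) can be connected to the Arrhenius form of equation (1) by a short worked step. This is a sketch that assumes the energy term in equation (2) is a free energy of activation that splits into an enthalpy part ΔH and an entropy part ΔS; that split is a standard textbook assumption and is not stated in the source.

```latex
% Assumption: \Delta F = \Delta H - T\,\Delta S (free energy of activation)
\begin{align}
D &= \alpha a^{2} Z \nu \exp\!\left(-\frac{\Delta F}{RT}\right)
   = \alpha a^{2} Z \nu \exp\!\left(\frac{\Delta S}{R}\right)
     \exp\!\left(-\frac{\Delta H}{RT}\right) \\
  &= D_{0} \exp\!\left(-\frac{Q}{RT}\right),
  \qquad D_{0} = \alpha a^{2} Z \nu \exp\!\left(\frac{\Delta S}{R}\right),
  \quad Q = \Delta H .
\end{align}
```

Under this assumption the temperature-independent factors of equation (2) make up the frequency factor of equation (1), and the activation enthalpy plays the role of Q.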

B. Ring diffusion: 

Self-diffusion in metals and alloys is not caused by the direct 
exchange of atoms, because direct exchange would produce large 
distortions in the crystal and is unfavourable in terms of 
activation energy. Diffusion can instead proceed by the ring 
mechanism, in which several atoms rotate at the same time. This 
mechanism is adequate to explain some unusual diffusion 
coefficient phenomena in metals with a body-centered cubic 
(B.C.C) crystal structure [5]. 

C. Vacancy mechanism: 

Diffusion occurs in this mechanism because an atom moves into a 
vacant site in the crystal. The distortion in this case is 
small, so the energy required is also small. This mechanism is 
the most predominant in metals and alloys with different 
crystal structures (BCC, FCC, HCP), and vacancy diffusion 
increases with rising temperature [8]. 

III. THE PRACTICAL SIDE 

A. The method of the experiment 

The chemical composition of the samples is shown in Table {1}; 
the base materials are AL200 and S20C. The welding samples are 
cylindrical, with dimensions of (14x20) mm for the tensile 
test, (20x28) mm for the impact test and (10x14) mm for 
microscopic examination of the crystal structure. As the 
diffusion welding device, a creep testing machine for metals 
was used, with its tensile load converted into a compressive 
load on the welding samples. An electric resistance furnace 
with a thermocouple to set the welding temperature was used to 
heat the samples. Welding was carried out in a vacuum chamber 
to prevent air from leaking into the weld. A hydraulic piston 
was also used for welding some samples. The welding time was 
counted from the moment the temperature reached the desired 
pre-set value. A range of different intermediate materials was 
used, whose thickness and purity are given in Table {2}. The 
temperature, pressure and welding time were held fixed while 
the type of intermediate material was changed, and the effect 
on the mechanical strength of the joint was recorded. The second 
part of the practical work concerns the mechanical tests 
performed on the samples after welding. 


Table {1}: The chemical compositions of samples used 

Table {2}: The thickness and purity of the interfaces used 

Insert material | Ni | 2024 alloy | Ag | Ti
Thickness (mm) | 0.01 | 0.05 | 0.03 | 1.03
Purity (%) | >99.5 | | >99.9 | >99.5

B. Mechanical tests 

Tensile tests were conducted on a tensile testing machine, with 
the tensile strength calculated at a cross-head speed of 0.5 
mm/min. The maximum tensile strength of the welded joint was 
determined in the direction perpendicular to the joint line 
between the two samples. An impact test was conducted on a 
Charpy machine, using a weight of 5 kg, to determine how much 
impact stress the welded joint can carry; the specimen was 
positioned so that the hammer edge struck the weld area. 
Microscopic examination was carried out near the joint line of 
samples welded at temperatures of (500, 520, 540, 600)°C. The 
samples were hot mounted, ground progressively with papers of 
grades (110, 420, 500, 1200), and then polished with alumina. 
Etching was then carried out with 4% Nital for 6 seconds, after 
which the sample was washed with alcohol and dried, ready for 
microscopy and imaging. 

IV. RESULTS AND DISCUSSION 

Sheets of 2024 aluminum alloy were first heated to 513°C at a rate of 2.5°C/min. To follow the change in the amount of liquid phase that arises on heating, the ingot was heated and held at different temperatures for 30 min, quenched in iced water, and the amount of liquid phase was then measured. The result is shown in fig. (1), which gives the relationship between the amount of liquid phase VL and temperature. A liquid-phase amount of 3.5 VL occurs at 600°C, and the number of grains, counted along a measured distance, falls rapidly as the temperature rises. Based on this result, Ni, AL2024, Ag, and Ti were used as interface materials with S20C at a fixed welding temperature of 600°C and a welding pressure of 20.065 kg/m; the results are illustrated in fig. (2). With Ni as the insert, the strength of the joint obtained is such that the sample breaks during turning of the test piece. Of the four interface materials, Ag gives the strongest joint, followed by the 2024 alloy, while Ti gives the weakest welded joint.


Fig. (1): The effect of welding temperature on the amount of the liquid phase and the number of grains of the alloy

Fig. (2): The effect of interface materials on the tensile strength of the welded joint under fixed welding conditions (pressure, temperature, time)

Fig. (3) shows the microstructure next to the welding zone for Ni with S20C obtained in this way. A small concavity and convexity can be seen along the boundary between the two pieces, and this movement of the dividing line is believed to improve the strength of the joint.

Fig. (3): Welding zone of the Ni-S20C joint under the welding conditions

Fig. (4) shows the effect of the welding temperature Tw on the strength of the AL-Ni joint between 610°C and 650°C. The tensile strength of the joint is equal to 4.7 kg/mm² at a welding pressure of 20.045 kg/mm, and it increases slightly when the temperature is raised from 640°C to 650°C.

Fig. (4): The effect of the welding temperature on the tensile strength of the joint (AL-Ni) at different welding pressure values (Pw)

The compounds formed at the boundary between AL and Ni during welding can be seen in fig. (5). When the temperature is raised from 610°C to 650°C, these compounds grow in volume.

Fig. (5): Welding zone of the joint (AL-Ni) under different welding pressures and temperatures.

Fig. (6) shows the effect of the welding temperature on the tensile strength of the AL-S20C welded joint. For pieces given a thermal treatment (annealing) at 600°C for 30 min, the strength of the joint is close to the value at which the test piece breaks during machining. The strength of the AL-S20C joint is greatest at 520°C and falls as the welding temperature is increased further; the inserts used are the 2024 alloy on the AL side and Ni on the S20C side. This result indicates that increasing Tw promotes volumetric growth of the phases formed at the crystal boundary, which leads to increased brittleness.

Fig. (7) illustrates the impact resistance of the AL-S20C welded joint. At 600°C the impact resistance of the test piece is almost equal to the value at which fracture occurs during surface machining. The maximum impact resistance is obtained at 520°C, and the impact resistance decreases as the temperature increases further.
Fig. (7): The effect of the welding temperature on the impact resistance of the joint (AL-S20C)

V. CONCLUSIONS

1. The strength of the welded joint depends on the type of insert material used in the welding process.

2. During diffusion welding of Ni-S20C, the temperature Tw must at least exceed the A1 transformation temperature, that is, be over 723°C.

3. The mechanical tests show that breakage occurs between the interface and the S20C in each of the four cases. From this we conclude that the temperature used, 600°C, is not suitable for achieving diffusion between the interface used and the S20C. If the temperature is raised further, diffusion accelerates, but the aluminum piece deforms, which sacrifices the most important feature of diffusion welding, namely that it limits deformation of the parent metal. In diffusion welding of metals whose melting points differ greatly, as in our experiments, and when the required strength is not obtained, first applying one layer of the insert to the metal with the higher melting point gives very good results: when nickel is welded to S20C at 850°C, the strength is equal to 772 kg/mm², which is much greater than that of the parent metal.

4. A high welding temperature Tw increases the volumetric growth of the interface compounds formed in diffusion welding and also reduces the tensile strength and impact resistance.
Comparative study of data transfers using Wi-Fi modules

Gaurav Khadse, Ninad Adhav 


Abstract — The need of wireless transfer of data from a 
microcontroller to an Android device or a desktop PC can be 
fulfilled by the use of a Bluetooth or Wi-Fi module. A token or a 
few bytes of data can be transferred using any Bluetooth 
module. However, a problem arises when the size of data 
increases to a few megabytes. A low-cost Bluetooth module does
not support the high data rate, and switching to a faster and more
efficient Bluetooth module is an expensive alternative. A
cost-effective solution is to use an easily available Wi-Fi
module, which is comparatively cheaper.

This paper describes some important steps for setting up a
Wi-Fi module, sending a large amount of data using the Wi-Fi
module, and comparing the speeds of the same module with
different microcontrollers.

Index Terms — Arduino Mega, Arduino Uno, ESP8266, 
Teensy 3.2, Data Integrity, Baud Rate. 


I. INTRODUCTION 

Data transfer from a microcontroller to an Android device or
a desktop PC is easy when a beginner in electronics has to send
only a token or a byte of data. The available wireless
communication modules, such as Bluetooth and Wi-Fi, make it
possible to transfer data in minimal time. A beginner can set up
a Bluetooth module in a few minutes by writing a simple
sketch and can then send data over it. Setting up a Wi-Fi
module is more cumbersome than setting up a Bluetooth module
and requires a sound knowledge of embedded systems and
networking.

When the amount of data to be transferred to an Android device
or desktop PC from an external SD card increases, the transfer
speed becomes a challenge for wireless transfer. Bluetooth
speed deteriorates when the module is used at a high baud rate.
Possible solutions are to use a high-performance Bluetooth
module or a Wi-Fi module. The high-performance Bluetooth
module has a large buffer and can handle high data rates.
However, it is expensive compared with a Wi-Fi module,
which is sufficient for the data transfer purpose given
the overhead of setting it up.

Sending a large amount of data from the SD card may take
several minutes, so reducing the transfer time is another
challenge. A few tests have been performed, and the data
transfer was made fast and easy using one of the cheapest
components available on the market.

II. LITERATURE SURVEY

The transfer rate would depend upon a lot of factors. Out of 
all these, the two main factors would be: 

1. How fast the SD card is being read, and

2. How fast and efficiently the data is being handled by
the Bluetooth or Wi-Fi module.

The second factor has been studied thoroughly and numerous 
experiments were performed on different Bluetooth and 
Wi-Fi modules to have a practically achievable transfer rate. 
This was reinforced by 100% data integrity. 

The Bluetooth modules considered were classic Bluetooth
modules such as the HC05, RN41, RN42, and BT33. The HC05,
RN41, and RN42 do not have enough buffer space to handle a
large amount of data continuously. Hence, flow control was
implemented using the RTS and CTS pins provided by these
Bluetooth modules. However, after flow control was
implemented, the transfer rate deteriorated drastically. These
Bluetooth modules took approximately 1 minute to send 1Mb of
data. The BT33 was more efficient than the other Bluetooth
modules and, with flow control properly implemented, achieved
a higher transfer rate than the HC05, RN41, and RN42. But the
BT33 was not cost-effective. Also, only one instance of the
Bluetooth module could be connected at a time to an Android
device or a desktop PC. This limitation could be overcome
using a Wi-Fi module.

Therefore, a Wi-Fi module was chosen, and research began
with finding a suitable Wi-Fi module that is reliable as well
as cost-effective. After going through several Wi-Fi modules,
the Espressif ESP8266-12e and Huzzah CC3000 were
shortlisted.

After an extensive comparison based on performance,
compatibility, supporting forums, and cost of the modules,
we selected the ESP8266-12e as the module for our tests.

The ESP8266-12e has a programmable microcontroller, but we
preferred to test it with an external controller, as we had other
time-constrained tasks to perform using an external
controller.


The following table outlines the basic structure of CC3000 
and ESP 8266. 

Wi-Fi chip/module              CC3000                     ESP8266
Wi-Fi Standards                802.11 b/g                 802.11 b/g/n
Packets                        TCP and UDP                TCP and UDP
Modes                          Client and Server          Client and Server
Concurrent Sockets             4                          5
Access Point Modes             No                         P2P, Soft-AP
Size                           26.22 x 40.45 x 2.95 mm    24 x 16 mm
Interface                      SPI                        TTL Serial
Encryption                     Up to WPA2-PSK             Up to WPA2-PSK
Sleep Current                  -                          <10 uA
Transmit Current               350 mA                     215 mA (typ.)
Receive Current                -                          ~60 mA
Digital Pins                   0                          9
Analog Pins                    0                          1
Other Pins                     0                          0 (E variant adds more)
Programmable Microcontroller   No                         Yes
Cost (US Dollars)              $34.95                     $3.37-6.95

Table 1. Comparison of Wi-Fi modules


III. Proposed Architecture 

The ESP8266-12e was tested with the Arduino UNO and Arduino
Mega in server mode. It was also tested with the Teensy 3.2 in
server as well as client mode. The mode is called server mode
when the ESP8266-12e acts as a server and an external mobile
device or a desktop PC acts as a client. The mode is called
client mode when the ESP8266-12e acts as a client. In this
mode, a Python server has to be created on a desktop PC.

General Configurations:

Wi-Fi Module Name: ESP8266-12e

Firmware used:
esp_iot_sdk_1.5.4 (New Firmware)
esp_iot_sdk_0.9.4 (Old Firmware)

Software used:
Arduino IDE, Teensyduino, Python IDLE.

Hardware used:
Arduino Uno, Arduino Mega, Teensy 3.2

Connection:

Fig 1. Connecting ESP8266 with Arduino UNO

Fig 2. Connecting ESP8266 with Arduino Mega



Common Steps for Data Transfer and Data Logging:

• Power up the device and upload the code.

• When checking data transfer on an Android phone:

> Start the Admin Hands app -> Hosts

> Create a new connection by selecting the “+” sign at the bottom left.

> Fill in the fields with the IP address and port number of the Wi-Fi module and select Telnet as the medium of transfer.

• When the code is compiled, connect the phone’s Wi-Fi to the ESP8266 SSID.

• Wait for 20 seconds from the time of compilation (delay provided manually in the program) and then select the new connection you have just created in the app. If everything is correct, you will see a blank black screen in the app.

• If the connection is not established properly, you will see the following message on the same black screen: “Connecting to “IP Address”....”

• After the successful connection, and after 10 seconds, you will see the data transfer.

• When checking data transfer on a desktop PC/laptop:

> Open PuTTY.

> Select Logging (under Session on the left) -> All Session Output -> browse to the folder where you want the logged file to be saved.

> Click Session -> fill in the IP address and port number of the module to be connected to -> Connection Type: Telnet

> Under the Saved Sessions bar, give a name and click Save to store the above settings for future use instead of repeating these steps every time.

> Once the code is compiled, start the PuTTY session; after 20 seconds you should see the data transfer. After the entire file is transferred, close PuTTY and browse to the file where the data is logged. The file will not be created until PuTTY is closed.

The following AT command set was used in the tests:

AT - Test module response
AT+GMR - Module information
AT+IPR=2000000 - Change the baud rate of the module (Old Firmware)
AT+UART_DEF=2000000,8,1,0,1 - Change the baud rate of the module (New Firmware)
AT+CWMODE=2 - Change mode of the module, 1-3 (1-Client, 2-Server, 3-Server+Client)
AT+CIPMUX=1 - Accept multiple connections (0-Single, 1-Multiple)
AT+CIPSERVER=1,80 - Set module as the server
AT+CIPAP="192.168.4.1" - Set the IP address of the module in server mode
AT+CIPSTA="192.168.5.1" - Set the IP address of the module in client mode
AT+CWSAP="DRILL","password",3,2 - Set the SSID and password of the module
AT+CIPSTAMAC? - Get current MAC address of the module in station mode
AT+CIPAPMAC? - Get current MAC address of the module in SoftAP mode
AT+CIPMODE=1 - Put module in Transparent Transmission Mode
AT+CIPSTART=1,"TCP","ip address","port" (when AT+CIPMUX=1) // doesn't work on new f/w (this command is for the client mode)
AT+CIPSTART="TCP","ip address","port" (when AT+CIPMUX=0)
AT+CIFSR - Get IP address of module as the client
AT+CIPSEND=0,2048 - Send data packets
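
As an illustration of how the server-mode commands from this list fit together, the following Python sketch builds the command sequence in the order used by the Arduino sketch in the next section. The function names `server_mode_commands` and `frame` are hypothetical and are not part of the original work; the CR+LF terminator is an assumption that matches what `Serial.println` appends on Arduino.

```python
def server_mode_commands(ssid, password, port=80, channel=3, encryption=2):
    """Return the AT command strings, in order, for soft-AP server mode.

    The default values 3 and 2 are the last two AT+CWSAP arguments
    used in the paper's command list.
    """
    return [
        "AT",                                               # test module response
        "AT+CWMODE=2",                                      # server (soft-AP) mode
        f'AT+CWSAP="{ssid}","{password}",{channel},{encryption}',
        "AT+CIPMUX=1",                                      # accept multiple connections
        f"AT+CIPSERVER=1,{port}",                           # start the server on the port
    ]


def frame(command):
    """Encode one command as ASCII bytes terminated by CR+LF."""
    return (command + "\r\n").encode("ascii")


for cmd in server_mode_commands("DRILL", "password"):
    print(frame(cmd))
```

A host-side tool could write each framed command to a serial port and wait for the module's acknowledgement before sending the next one, which is what `SendCommand` does in the sketch.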


IV. Experiments 

1. ESP8266 with Arduino Uno (Server Mode) 

The ESP8266 was first tested with Arduino UNO. The 
hardware-serial was used as it has dedicated Tx, Rx pins 
which results in faster communication than the 
software-serial. 

The Following sketch was uploaded in Arduino UNO: 

#include <SPI.h>
#include <SD.h>

#define TIMEOUT 5000 // mS
#define LED 13

char buf[512];
const int chipSelect = 10;

File myFile;
char invar = 0;
char invar1 = 0;

void setup()
{
  pinMode(LED, OUTPUT);
  Serial.begin(4000000);

  /* SD CARD INIT */
  //Serial.print("Initializing SD card...");
  pinMode(SS, OUTPUT);
  if (!SD.begin(chipSelect)) {
    return;
  }

  SendCommand("AT", "Ready");
  SendCommand("AT+CWMODE=2", "OK");
  Serial.println("AT+CWSAP=\"TFM_DRILL\",\"password\",3,2");
  //SendCommand("AT+CIFSR", "OK");
  SendCommand("AT+CIPMUX=1", "OK");
  SendCommand("AT+CIPSERVER=1,80", "OK");

  /* FIND IMPORT STRING */
  String IncomingString = "";
  boolean StringReady = false;
  delay(25000);                  // give the client time to connect
  StringReady = true;

  if (StringReady) {
    /* READ FILE IF FOUND IMPORT STRING */
    myFile = SD.open("DATA50.TXT");
    if (myFile)
    {
      while (myFile.available())
      {
        // Announce a 512-byte packet on connection 0
        Serial.println("AT+CIPSEND=0,512");
        while (1) {              // wait for the module's ">" prompt
          if (Serial.find(">")) {
            invar = 1;
          }
          if (invar == 1) break;
        }
        myFile.read(buf, 512);   // read one block from the SD card
        Serial.write(buf, 512);  // pass it to the module
        while (1) {              // wait until the module confirms the send
          if (Serial.find("OK")) {
            invar1 = 1;
          }
          if (invar1 == 1)
            break;
        }
        invar = 0;
        invar1 = 0;
      }
      myFile.close();
    }
    else
    {
      Serial.println("error opening DATA50.TXT");
    }
  }
}

/* LOOP */
void loop() {
}

/* FUNCTIONS */

// Send an AT command and wait for the acknowledgement string.
// Returns true if the ack was found, false if the wait timed out.
boolean SendCommand(String cmd, String ack) {
  Serial.println(cmd);           // Send "AT+" command to module
  return echoFind(ack);
}

// Scan incoming serial data for keyword until TIMEOUT expires.
boolean echoFind(String keyword) {
  byte current_char = 0;
  byte keyword_length = keyword.length();
  long deadline = millis() + TIMEOUT;
  while (millis() < deadline) {
    if (Serial.available()) {
      char ch = Serial.read();
      //Serial.write(ch);
      if (ch == keyword[current_char])
        if (++current_char == keyword_length) {
          //Serial.println();
          return true;
        }
    }
  }
  return false;                  // Timed out
}

/* END of functions */

For this setup, a 1Mb file took 35 seconds to transfer. The
limitation of this setup was the memory size of the Uno, which
meant we could only send buffered data of at most 512
characters at a time. This added cycles to the loop in the
program, which increased the processing time and ultimately
resulted in a fairly slow transfer speed.
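
To make the cost of the small buffer concrete, the short Python sketch below counts how many AT+CIPSEND cycles a file needs for a given buffer size and computes the effective throughput from a measured time. It assumes that "1Mb" in the text means 1,048,576 bytes, which the paper does not state explicitly; the function names are illustrative only.

```python
import math

# Assumption: the paper's "1Mb" file is 1 MiB = 1,048,576 bytes.
FILE_SIZE_BYTES = 1024 * 1024


def cipsend_cycles(file_size_bytes, buffer_size):
    """Number of AT+CIPSEND round trips needed to send the whole file."""
    return math.ceil(file_size_bytes / buffer_size)


def effective_throughput(file_size_bytes, seconds):
    """Average payload rate in bytes per second."""
    return file_size_bytes / seconds


uno_cycles = cipsend_cycles(FILE_SIZE_BYTES, 512)      # 512-byte buffer
large_cycles = cipsend_cycles(FILE_SIZE_BYTES, 2048)   # 2048-byte buffer
uno_rate = effective_throughput(FILE_SIZE_BYTES, 35)   # 35 s measured on the Uno

print(uno_cycles, large_cycles, round(uno_rate))
```

Under this assumption a 512-byte buffer needs four times as many command round trips as a 2048-byte buffer, and each round trip waits for both the ">" prompt and the "OK" reply, which is consistent with the slow transfer reported for the Uno.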

2. ESP8266 with Arduino Mega (Server Mode): 

To solve the above limitation in the case of Arduino Uno, we 
switched to Arduino Mega which has more memory than the 
Uno. The Mega could handle large buffer size and so could 
increase the transfer speed. 

We made some changes to the code used on the Arduino UNO.
The buffer size, which was 512 bytes on the Arduino UNO, was
increased to 2048 bytes on the Arduino Mega, so the Mega could
handle a larger amount of data in a single go. Also, the serial
baud rate was increased from 4000000 to 5000000.

For this setup, 1Mb file took 15 Seconds to transfer. Transfer 
time is significantly improved as now we are sending data in 
block sizes of 2048 which is the maximum sending size for 
the ESP8266. 

3. ESP8266 with Teensy 3.2 (Server Mode) 

Finally, we tested the ESP8266 with the Teensy 3.2, which is
much faster than the Arduino Uno and Mega. It also has more
memory than the other two.

In this setup, 1Mb data took 5 Seconds to transfer. This was 
the fastest time achieved amongst the 3 methods. We are 
sending data at the maximum baud rate of 5 Million and 
maximum buffer size of 2048. 
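
For context, the measured times can be compared with the nominal limit of the serial link. The Python sketch below is a rough estimate only: it assumes 8 data bits, no parity, and 1 stop bit (10 bits on the wire per byte), that the UART really runs at the configured baud rate, and that "1Mb" means 1,048,576 bytes. None of these are confirmed by the paper.

```python
# Assumption: 1 start bit + 8 data bits + 1 stop bit per byte on the wire.
BITS_PER_FRAME = 10
FILE_SIZE_BYTES = 1024 * 1024  # assumption: "1Mb" means 1 MiB


def uart_bytes_per_second(baud, bits_per_frame=BITS_PER_FRAME):
    """Nominal payload rate of an asynchronous serial link."""
    return baud / bits_per_frame


def ideal_transfer_seconds(file_size_bytes, baud):
    """Time to move the file if the UART were the only bottleneck."""
    return file_size_bytes / uart_bytes_per_second(baud)


def link_utilisation(file_size_bytes, baud, measured_seconds):
    """Fraction of the nominal UART rate actually achieved."""
    return ideal_transfer_seconds(file_size_bytes, baud) / measured_seconds


ideal = ideal_transfer_seconds(FILE_SIZE_BYTES, 5_000_000)
teensy_use = link_utilisation(FILE_SIZE_BYTES, 5_000_000, 5)   # 5 s reported
mega_use = link_utilisation(FILE_SIZE_BYTES, 5_000_000, 15)    # 15 s reported

print(round(ideal, 2), round(teensy_use, 2), round(mega_use, 2))
```

Under these assumptions the serial link alone would need about 2.1 seconds, so the remaining time in each setup is spent elsewhere, for example reading the SD card and waiting for the module's acknowledgements.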

4. ESP8266 with Teensy (Client Mode - Python Server)


The connection for this setup is same as the server mode. The 
only difference here is that we configure the ESP module in 
client mode and send data through the Transparent 
Transmission mode of the module. For the server we use a 
script which is written in Python. 


Python Server Sketch: 

import socket                 # Import socket module
import time

s = socket.socket()           # Create a socket object
host = '192.168.0.119'        # IP address of this PC on the Wi-Fi network
port = 80                     # Reserve a port for your service.
s.bind((host, port))          # Bind to the port
s.listen(5)                   # Now wait for client connection.

while True:
    c, addr = s.accept()      # Establish connection with client.
    print('Got connection from', addr)
    start_time = time.time()
    print("Receiving Data from Client...")
    # Open the output file per connection so that a second client
    # does not write to an already closed file.
    with open('Got_File.txt', 'wb') as f:
        l = c.recv(1024)
        while l:
            print("Receiving...")
            f.write(l)
            # print(l)
            l = c.recv(1024)
    # s.shutdown(socket.SHUT_WR)
    print("Done Receiving")
    print("--- %s seconds ---" % (time.time() - start_time))
    try:
        c.send(b"Thank you for connecting")   # sockets send bytes in Python 3
    except OSError:
        pass                                  # client has already closed the link
    c.close()


For this setup, 1Mb file took 3 minutes to transfer. Work is 
still underway on this setup. We need to reduce the transfer 
times as the current results are not ideal for us. So we are still 
sticking with the results we got using the “ESP with teensy - 
Server Mode”. 


V. RESULTS 

Following are the final and best results that we have achieved 
from all scenarios and setups. 

Arduino Uno and ESP : 

Server Mode: 


Baud Rate: 4 Million 

Buffer Size: 512 

File Size: 1Mb 

Time: 35 Seconds 

File Size: 8Mb 

Time: 2 Minutes 30 Seconds


Data Integrity: 100% 

Arduino Mega and ESP:

Server Mode: 


Baud Rate: 5 Million 

Buffer Size: 2048 

File Size: 1Mb 

Time: 15 Seconds 

File Size: 8Mb 

Time: 2 Minutes 

Data Integrity: 100% 


Teensy and ESP: 

Server Mode: 


Baud Rate: 5 Million 

Buffer Size: 2048 

File Size: 1Mb 

Time: 4 Seconds 

File Size: 8Mb 

Time: 35 Seconds 

Data Integrity: 100% 


Teensy and ESP: 

Client Mode: Python Server: Transparent Transmission Mode


Baud Rate: 2 Million

Buffer Size: No Buffer


File Size: 1Mb 

Time: 3 Minutes 

File Size: 8Mb 

Time: Not Tested 

Data Integrity: 100% 
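
The paper does not say how the 100% data integrity figure was verified. One possible way to check it, shown here only as a sketch, is to compare a cryptographic hash of the file on the SD card with a hash of the file logged on the receiving side. The function names and the file names in the usage line are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(sent_path, received_path):
    """True when both files have identical contents (same SHA-256 digest)."""
    return sha256_of_file(sent_path) == sha256_of_file(received_path)
```

A call such as `files_match("DATA50.TXT", "Got_File.txt")` would then report whether the received copy is byte-for-byte identical to the source file. Note that a PuTTY session log may contain extra characters from the Telnet session, so a log file may need to be cleaned before such a comparison.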


VI. CONCLUSION 

After performing several experiments on the ESP8266 with
various microcontrollers, we conclude that a large amount of
data can be transferred in a short time using the ESP8266 Wi-Fi
module together with the Teensy 3.2. Thus, the need for
wireless transfer of data from an external SD card to an
Android device or desktop PC can be fulfilled by using a Wi-Fi
module as the communication medium.

An investigation of the durability and compressive
strength of air cured microconcretes containing
different types of aggregates

Apostolos S. Marinos, John A. Marinos 


Abstract — This paper investigates how the type of aggregates 
affects the compressive strength of concrete and also its 
durability against chloride penetration and carbonation. 
Microconcretes (concrete without coarse aggregates) containing
different types of sand (sand from crushed limestone, river silica
sand) were produced. Water-to-cement (w/cm) ratios of 0.35,
0.4 and 0.5 were used in the production of the microconcretes. The
durability of the microconcretes against chloride penetration was
tested with the Rapid Chloride Permeability Test (RCPT) method.
The carbonation of the microconcretes was determined by means of
phenolphthalein indicator, and the compressive strength of
microconcrete specimens was tested according to EN 196-1. Test
results revealed that the two types of sand (aggregate) used
in this study affect the properties of microconcrete, such as
compressive strength and durability, to an equivalent degree. It
can also be concluded from the test results that the w/cm ratio
critically affects the properties of microconcrete. Finally, from the
correlation between chloride permeability results and electrical 
conductivity of microconcretes it can be concluded that 
electrical conductivity measurements can be used as a rapid and 
non-destructive method to estimate concrete resistance against 
chloride penetration. 

Index Terms — Air Curing, Carbonation, Chlorides, Compressive Strength


I. INTRODUCTION 

Reinforced concrete is the most widely used composite 
material in structural practices due to ease in applications and 
low cost of construction. However, the service life of these 
structures can be affected critically by a number of 
environmental conditions [1]. The majority of concrete 
deterioration cases is connected to reinforcement corrosion 
due to carbonation - or chloride - induced depassivation of 
steel bars [2]. Chloride ions and carbon dioxide, which
can be found in high concentrations in various environments,
can cause corrosion of the reinforcement when they penetrate
into the concrete matrix, which leads to premature
deterioration of the concrete structure [3]-[14]. Therefore, the
resistance of concrete against chloride penetration and
carbonation has an important effect on its durability and
hence on a concrete structure’s service life [15]-[17].

The main constituents of concrete are: Cement, Water, 
Coarse and Fine Aggregates. Although the basic properties 
and characteristics of concrete are mainly affected by the type 
of cement and cement hydration products, aggregates must 
also be considered as an important constituent of concrete.
Because the aggregates occupy about 60 - 70% of the volume
of concrete, their impact on various characteristics and
properties of concrete is undoubtedly considerable [18]-[20].

In this study, microconcretes (concrete without coarse
aggregates) with different aggregate types (sand from crushed
limestone, river silica sand) were produced, and their
compressive strength and durability against chloride
penetration and carbonation were studied in order to
investigate the effect of aggregate type on concrete
properties. The effect of the w/cm ratio on concrete properties
was also studied. Finally, the electrical conductivity of the
microconcretes was estimated from Rapid Chloride
Permeability measurements.


II. Experimental Program 

A. Materials 

The materials used in this study were Portland-composite
cement (CEM II / B-M (P-W) 42.5N according to EN 197-1),
potable water according to EN 1008:2002, two types of fine
aggregates (limestone sand, river sand) and superplasticizer.
The chemical composition and the physical properties of 
cement and sands are given in Table 1. Fig. 1 presents the 
Particle Size Distribution (PSD) of cement and Fig. 2 the 
PSD of limestone sand and river sand. The PSD of cement 
was defined with Static Laser Light Scattering (SLS) method 
with a CILAS - 1064 Particle Size Analyzer. The PSD of 
sands was determined according to ASTM C 136 - 06. A 
chloride free, polycarboxylate based superplasticizer (SP) 
(Sika® ViscoCrete® - 300) was employed to achieve the 
desired workability in all mixtures. 

B. Mix proportions and sample preparation 

Six different microconcrete mixtures were designed and 
prepared in this study. The mixture proportions are presented 
in Table 2. As shown in Table 2, microconcretes A, B
and C contained limestone sand as aggregate, while
microconcretes D, E and F contained river sand as aggregate.
Initially, the dry materials were mixed together at low mixer
speed, and then water and superplasticizer were added.
Superplasticizer was added at the time of mixing in order to
keep flow table (EN 1015-3:1999) values in the range of
190 ± 5 mm for all mixtures.

Table 1 Chemical composition and physical properties of cement and sands

                        CEM II 42.5N   Limestone Sand   River Sand
                        (%)            (%)              (%)
SiO2                    22.1           -                97.8
Al2O3                   6.27           -                0.85
Fe2O3                   3.55           0.02             0.17
CaO                     55.97          55.5             0.1
MgO                     2.2            0.72             0.28
K2O                     0.71           0.01             0.65
Na2O                    0.3            -                -
SO3                     3.1            -                -
TiO2                    0.31           -                0.035
LOI                     5.23           43.52            0.2
Blaine (cm²/g)          4461           -                -
Sp. Gravity (g/cm³)     2.96           2.7              2.6

Cement Compressive Strength according to EN 196-1 (MPa)
2 days     28.9
7 days     40.4
28 days    50.7



Fig. 1 Particle Size Distribution of CEM II 42.5N (horizontal axis: particle size [μm])

Fig. 2 Particle Size Distribution of sands (horizontal axis: sieve opening size (μm))

Table 2 Microconcrete mixture proportions

                     Limestone sand              River sand
Mixture              A        B        C         D        E        F
Materials            kg/m³                       kg/m³
CEM II 42.5N         496      519      532       487      507      521
Water                248      207      186       244      203      182
Sand                 1488     1556     1597      1462     1522     1563
SP                   3.7      6        7.6       3        6        7
w/cm                 0.5      0.4      0.35      0.5      0.4      0.35
Flow Table (mm)      190      186      187       195      192      186
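
As a quick consistency check on Table 2, the nominal w/cm ratios can be recomputed from the water and cement contents. The Python sketch below uses only the values in the table; the dictionary layout and function names are illustrative.

```python
# Water, cement and sand contents in kg/m^3, copied from Table 2.
MIXTURES = {
    "A": {"cement": 496, "water": 248, "sand": 1488, "nominal_wcm": 0.5},
    "B": {"cement": 519, "water": 207, "sand": 1556, "nominal_wcm": 0.4},
    "C": {"cement": 532, "water": 186, "sand": 1597, "nominal_wcm": 0.35},
    "D": {"cement": 487, "water": 244, "sand": 1462, "nominal_wcm": 0.5},
    "E": {"cement": 507, "water": 203, "sand": 1522, "nominal_wcm": 0.4},
    "F": {"cement": 521, "water": 182, "sand": 1563, "nominal_wcm": 0.35},
}


def water_to_cement(mix):
    """Water-to-cement mass ratio of one mixture."""
    return mix["water"] / mix["cement"]


def sand_to_cement(mix):
    """Sand-to-cement mass ratio of one mixture."""
    return mix["sand"] / mix["cement"]


for name, mix in MIXTURES.items():
    print(name, round(water_to_cement(mix), 3), round(sand_to_cement(mix), 2))
```

The computed w/cm values agree with the nominal ratios to two decimal places, and the sand-to-cement ratio works out to about 3 in every mixture.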


Prismatic specimens with dimensions 40 x 40 x 160 mm and
cylindrical specimens with dimensions Ø150 x 300 mm were
cast for each mixture. Prismatic specimens (Fig. 3) were cast
according to EN 196-1. Cylindrical specimens (Fig. 4) were
cast in steel moulds and compacted with a vibrating table.

Fig. 3 Prismatic specimens (40 x 40 x 160 mm)

Fig. 4 Cylindrical specimens (Ø150 x 300 mm)


C. Curing 

After casting, the cylindrical specimens were covered with 
a wet blanket to minimize water evaporation and cured under 
laboratory conditions for 24 hours. The prismatic specimens 
were covered with a plastic sheet and cured in a humidity 
chamber for 24 hours. After 24 h, all specimens were 
demoulded and left to cure under laboratory conditions 
(Air Curing - Temperature: 19 - 23°C, Relative Humidity: 
55 - 70%). 

III. Test Methods 

A. Compressive strength 

The 40 x 40 x 160 mm microconcrete specimens were used 
for compressive strength measurements. Compressive 
strength was determined according to EN 196 - 1 [21]. For 
each mixture and at each curing age (28, 90, 180 and 360 
days), three specimens were tested and the mean value of 
these measurements is reported below. 
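
The reported value for each mixture and age can be sketched as follows. In EN 196-1 the compressive strength is the failure load divided by the 40 mm x 40 mm loaded area (1600 mm²); the averaging over three specimens follows the description above. The failure loads in the usage lines are made-up numbers for illustration, not data from this study.

```python
from statistics import mean

# Loaded area of the 40 mm x 40 mm platens used in EN 196-1, in mm^2.
LOADED_AREA_MM2 = 40 * 40


def compressive_strength_mpa(failure_load_n):
    """Compressive strength in MPa (N/mm^2) from the failure load in newtons."""
    return failure_load_n / LOADED_AREA_MM2


def mean_strength_mpa(failure_loads_n):
    """Mean compressive strength of a set of specimens."""
    return mean(compressive_strength_mpa(load) for load in failure_loads_n)


# Hypothetical failure loads (N) for three specimens of one mixture.
example_loads = [80000, 81600, 78400]
print(mean_strength_mpa(example_loads))
```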

B. Chloride Permeability 

The chloride permeability of microconcrete specimens was
estimated according to ASTM C 1202 [22]. This method, also
called the Rapid Chloride Permeability Test (Fig. 5), can be
used to estimate the resistance (durability) of concrete against
chloride penetration. This durability is crucial for the service
life of concrete structures, since chlorides that reach the
reinforcement can cause corrosion, which leads to concrete
deterioration and finally failure of the structure.

Fig. 5 Schematic diagram of Rapid Chloride Permeability Test

The chloride permeability of microconcrete specimens was
measured after 28, 90, 180 and 360 days of curing. To
measure it, cylindrical specimens with dimensions
Ø95 x 50 ± 1 mm were prepared (Fig. 6). First, a Ø95 x 300 mm
core sample was drilled from each Ø150 x 300 mm
cylindrical specimen after seven days of curing. A 20 mm
wide slice was cut from the top of each core sample, in order
to avoid possible inhomogeneities of the upper part of the
microconcrete specimens due to compaction. The core
samples were then cured under laboratory conditions until the
test day. Finally, a 50 ± 1 mm wide slice was cut from each
core sample with a water-cooled diamond saw and used to
measure the chloride permeability of the microconcrete after
the specific curing period. By applying a potential of 60 V of
direct current and measuring the quantity of electrical charge
(in coulombs, Cb) passing through a Ø95 x 50 ± 1 mm
specimen, we can estimate microconcrete durability against
chloride penetration.

Fig. 6 A Ø95 x 50 ± 1 mm cylindrical specimen
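
The charge measurement described above can be sketched in Python. The paper does not give the calculation; the sketch assumes the usual ASTM C 1202 procedure as we understand it, in which the current is read every 30 minutes over 6 hours and integrated with the trapezoidal rule, and the qualitative classes commonly quoted from that standard. The current readings in the usage line are hypothetical.

```python
def charge_passed_coulombs(currents_amperes):
    """Trapezoidal integration of 13 current readings taken every 30 min.

    Q = 900 * (I0 + 2*I30 + ... + 2*I330 + I360), where 900 s is half of
    the 1800 s reading interval.
    """
    if len(currents_amperes) != 13:
        raise ValueError("expected 13 readings: 0 to 360 min in 30 min steps")
    interior = sum(currents_amperes[1:-1])
    return 900 * (currents_amperes[0] + 2 * interior + currents_amperes[-1])


def chloride_penetrability(charge_coulombs):
    """Qualitative class for a charge value (assumed ASTM C 1202 limits)."""
    if charge_coulombs > 4000:
        return "High"
    if charge_coulombs > 2000:
        return "Moderate"
    if charge_coulombs > 1000:
        return "Low"
    if charge_coulombs >= 100:
        return "Very Low"
    return "Negligible"


# Hypothetical test: a constant current of 0.1 A for the whole 6 hours.
charge = charge_passed_coulombs([0.1] * 13)
print(round(charge), chloride_penetrability(charge))
```

Because the specimens here have the Ø95 mm diameter the method is normally referenced to, no diameter correction is applied in this sketch.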



Fig. 8 Carbonation depth measurement 


C. Accelerated carbonation testing 

After curing under laboratory conditions for 5 days, two 
cylindrical specimens of each mixture (Ø150 x 300 mm) were 
placed in a chamber with a controlled concentration of CO₂ 
(22-23%). The relative humidity inside the chamber was 
55-70%. The bottom and top surfaces of the cylindrical 
specimens were sealed so that CO₂ could penetrate only 
through the cylindrical surface (Fig. 7). After specific CO₂ 
exposure periods, a 20 mm thick slice was cut from each 
cylinder with a diamond saw and the carbonation depth was 
determined by means of phenolphthalein indicator (Fig. 8). 

IV. Results and Discussion 

A. Compressive strength results 

The compressive strength of microconcrete specimens 
containing limestone sand and of microconcrete specimens 
containing river sand is presented in Fig. 9 and Fig. 10 
respectively. From the compressive strength results it can be 
concluded that, for all w/cm ratios, microconcrete specimens 
with limestone sand and microconcrete specimens with river 
sand show similar compressive strength. 



Also, comparing the values of compressive strength for 
different w/cm ratios, it can be concluded that the lower the 
w/cm ratio, the higher the compressive strength. A low w/cm 
ratio leads to a denser microstructure and therefore to 
higher compressive strength. 

Fig. 7 A cylindrical specimen with dimensions 
Ø150 x 300 mm prepared for carbonation testing 

Fig. 9 Compressive strength of microconcrete specimens 
containing limestone sand 

Fig. 10 Compressive strength of microconcrete specimens 
containing river sand 

Fig. 12 Electrical charge passed through microconcrete 
specimens containing river sand 


B. Chloride permeability results 

The electrical charge passed through microconcrete 
specimens during chloride permeability measurements is 
presented in Fig. 11 and 12. It can be observed from these 
results that, for all three w/cm ratios and all four curing ages, 
microconcrete specimens containing limestone sand show 
chloride ion permeability similar to that of microconcrete 
specimens containing river sand. Comparing the electrical 
charge values, it can be concluded that microconcrete 
specimens containing limestone sand and microconcrete 
specimens containing river sand show equivalent durability 
(resistance) against chloride ion penetration. Also, comparing 
the values of electrical charge for different w/cm ratios, it can 
be concluded that when the w/cm ratio is high (w/cm = 0.5), 
microconcrete specimens show high chloride permeability. 



Fig. 11 Electrical charge passed through microconcrete 
specimens containing limestone sand (x-axis: age in days, at 
28, 90, 180 and 360 days) 


On the other hand, when the w/cm ratio is low (w/cm = 0.35), 
microconcrete specimens show low chloride permeability. A 
low w/cm ratio leads to a dense and less porous 
microstructure. Therefore, microconcrete specimens with 
low w/cm ratio show higher resistance against chloride 
penetration, compared to microconcrete specimens with high 
w/cm ratio. 


C. Accelerated carbonation results 

In order to estimate the carbonation of microconcrete 
specimens containing different types of sand, Ø150 x 300 mm 
cylindrical specimens were exposed to CO₂ for four different 
periods. The results of the accelerated carbonation 
measurements are presented in Fig. 13 and 14. It can be 
observed from Fig. 13 and 14 that microconcrete specimens 
containing limestone sand and microconcrete specimens 
containing river sand show similar carbonation for all four 
exposure periods. Therefore, it can be concluded that 
microconcrete specimens containing limestone sand and 
microconcrete specimens containing river sand show 
equivalent durability against CO₂ penetration. Also, from 
Fig. 13 and 14 it can be concluded that when the w/cm ratio is 
high (w/cm = 0.5), microconcretes show high carbonation 
depth. On the other hand, when the w/cm ratio is low 
(w/cm = 0.35), microconcretes show low carbonation depth. 

Fig. 13 Carbonation depth of microconcrete specimens 
containing limestone sand 

Fig. 14 Carbonation depth of microconcrete specimens 
containing river sand (x-axis: CO₂ exposure period in days) 

Microconcretes with low w/cm ratio develop a dense and less 
porous microstructure. Therefore, microconcretes with low 
w/cm ratio show higher resistance against carbonation, 
compared to microconcretes with high w/cm ratio. 

D. Correlation between w/cm ratio and electrical charge 

Fig. 15 and 16 show the correlation between w/cm ratio and 
electrical charge passed through microconcrete specimens 
that were cured under laboratory conditions for 28 days. 




Fig. 16 Correlation between w/cm ratio and electrical charge 

It can be observed from Fig. 15 and 16 that there is a high 
linear correlation (R² = 0.99) between w/cm ratio and the 
electrical charge passed through microconcrete specimens. 
Therefore, it can be concluded from Fig. 15 and 16 that the 
w/cm ratio significantly and directly affects the durability of 
concrete against chloride penetration. The linear correlations 
between w/cm ratio and electrical charge passed through 
microconcrete specimens cured for 90, 180 and 360 days 
show similar regression results (R² > 0.95). 

E. Correlation between w/cm ratio and compressive strength 

Fig. 17 and 18 show the correlation between w/cm ratio and 
compressive strength of microconcrete specimens that were 
cured under laboratory conditions for 28 days. It can be 
observed from Fig. 17 and 18 that there is a high linear 
correlation (R² = 0.99) between w/cm ratio and compressive 
strength of microconcrete specimens. Therefore, it can be 
concluded from Fig. 17 and 18 that the compressive strength 
of concrete is strongly and directly affected by the w/cm 
ratio. The linear correlations between w/cm ratio and 
compressive strength of microconcrete specimens cured for 
90, 180 and 360 days show similar regression results 
(R² > 0.95). 



Fig. 17 Correlation between w/cm ratio and compressive 
strength 



Fig. 18 Correlation between w/cm ratio and compressive 
strength 


V. Electrical Conductivity 

The results of electrical conductivity measurements can be 
used as an indication of the permeability properties of the 
concrete microstructure. It has been reported in the literature 
[23]-[26] that electrical conductivity (σ) measurements can 
be used as a non-destructive method for estimating concrete 
durability against chloride penetration. In this study, the 
electrical conductivity was calculated from the ASTM C 1202 
chloride permeability measurements, using (1): 

$$\sigma = \frac{I_0 \, L}{A \, V} \tag{1}$$

where 

• σ: Electrical conductivity (S/m) 

• I₀: The initial current measured at the beginning of the 
chloride permeability measurements (A) 

• L: Specimen’s thickness (m) 

• V: The applied voltage (V) 

• A: Specimen’s surface exposed to chlorides (m²) 
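
As an illustration of how (1) can be evaluated, the following minimal Python sketch computes the conductivity for the specimen geometry described in the test procedure (95 mm diameter, 50 mm thickness, 60 V). It assumes that the exposed surface is the full circular face of the specimen, and the initial current used here is a hypothetical placeholder, not a value measured in this study.

```python
import math


def electrical_conductivity(i0_amp, thickness_m, voltage_v, area_m2):
    """Return sigma = (I0 * L) / (A * V) in S/m, following Eq. (1)."""
    if voltage_v <= 0 or area_m2 <= 0:
        raise ValueError("voltage and area must be positive")
    return (i0_amp * thickness_m) / (area_m2 * voltage_v)


# Geometry and voltage taken from the test description:
# a 95 mm diameter, 50 mm thick specimen under 60 V DC.
DIAMETER_M = 0.095
THICKNESS_M = 0.050
VOLTAGE_V = 60.0

# Assumption: the whole circular face is exposed to chlorides.
AREA_M2 = math.pi * (DIAMETER_M / 2) ** 2

# Hypothetical initial current (illustrative only, not measured data).
I0_EXAMPLE_A = 0.100

sigma = electrical_conductivity(I0_EXAMPLE_A, THICKNESS_M, VOLTAGE_V, AREA_M2)
print(f"sigma = {sigma:.4f} S/m")
```

Because L, A and V are fixed by the test setup, the conductivity in this sketch is directly proportional to the initial current I₀.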



Fig. 19 Correlation between electrical conductivity and 
electrical charge passed through microconcrete specimens 
(x-axis: electrical conductivity, S/m) 

A high linear correlation (R² = 0.98) is observed between 
electrical conductivity and electrical charge passed for all 
microconcrete specimens (Fig. 19). From Fig. 19 it can be 
concluded that electrical conductivity measurements can be 
used to investigate the durability of a concrete against 
chloride penetration, since, as Fig. 19 shows, a concrete with 
low electrical conductivity will show greater durability 
against chloride penetration. 

VI. Conclusions 

Based on the findings of the experimental program presented 
above, the following conclusions can be drawn: 

❖ Microconcretes with limestone sand and microconcretes 
with river sand show equivalent compressive strength and 
equivalent durability against chloride penetration and 
carbonation. 

❖ The w/cm ratio significantly affects the compressive 
strength, the chloride permeability and the carbonation of 
microconcretes. A low w/cm ratio leads to a more compact 
and less porous microstructure and therefore to higher 
compressive strength and greater durability against 
chloride penetration and carbonation. 

❖ Electrical conductivity measurements can be used as a 
rapid and non-destructive method to estimate concrete 
durability against chloride penetration. 

Enhancing Security Information and Event 
Management to Develop Future-Ready Security 
Operations Center 

Steffi Raju 


Abstract — The threats to the security of networking systems 
are on the rise. This has led to a continued need to implement 
effective monitoring of the events and the activities over the 
information network infrastructure via the Security Operations 
Center (SOC). It is in this context that Security Information and 
Event Management (SIEM) gains prominence. SIEM is a 
technology that facilitates real-time network monitoring for 
insider threats within a given organization’s SOC. It analyses 
not just the current security 
events but also evaluates these incidents with historically 
archived security log data to identify patterns in security threats 
and to help security architects make the underlying architecture 
more robust. The aim of this study is to enhance the existing 
SOC setup by incorporating new enhanced architecture and 
working procedures. It also aims to automate the testing and 
analysis of standard security controls using SIEM. All the above 
developments would help create a future-ready SOC which 
would greatly strengthen the overall IT security landscape of an 
organization. 

Index Terms — ISO 27001 Security Controls, Security 
Operations Center, Security automation, SIEM. 


I. INTRODUCTION 

Information security is a buzzword in the global IT 
landscape and is one of the most critical components in an 
organization. Nowadays, hackers are very advanced in their 
approach to breaking organizational information security, 
and sophisticated mechanisms are being used to compromise 
security over the network in IT systems. These mechanisms 
are launched either from within the network (internal threats) 
or from outside the network [1]. Previously, organizations 
were passive responders to threats and reacted only as and 
when a security breach occurred. However, the deep 
financial, reputational and operational impact of 
cyber-attacks has prompted organizations to establish 
Computer Security Incident Response Teams (CSIRTs) 
working as part of a Security Operations Center (SOC). These 
centers proactively monitor security incidents in real time and 
take the requisite action as and when a vulnerability is 
identified. Security Information and Event Management 
(SIEM) technology has been extensively deployed as part of 
SOCs to assist in the whole data collection and analysis 
process. 

The role of the SOC is to apply SIEM technology so as to 
handle enterprise-level security. SIEM correlates the log 
information and the network’s events in order to manage the 
risk of network attacks. It helps in tracking possible threats in 
the network, and it usually does so in real time. The 
effectiveness of SIEM technology in monitoring network 
threats depends on the ability to detect the origin of the 
attacks or threats. Insider threats typically originate within 
the network pool, where the organization’s network users 
(managers, workers or supervisors) access sensitive 
information over the network. 

SIEM helps to consolidate and thereby evaluate messages and 
alerts originating from different IT systems in a centralized 
platform [2]. SIEM systems are effective because they can 
process large amounts of security data and present the raw 
data in a visual form that is comprehensible to the end-user 
[3]. Visualization is thus an essential part of SIEM systems. 
Overall, SIEM helps in evaluating the security of computer 
networks on a real-time or near real-time basis by monitoring 
security incidents, thereby mitigating the risk of information 
leakage due to security gaps [4]. 

Security logging is an old concept and has been implemented 
in organizations for quite some time now. However, in a 
multi-system enterprise environment a security logging 
mechanism would not be effective if the data produced is 
time-consuming to go through and complex to interpret. It is 
in this context that the quality of the data is more important 
than the quantity of data produced [5]. SIEM makes it 
possible to collect, store, correlate and analyze the complete 
logs and to present them in a meaningful manner to the 
end-user. 

The current setup of SIEM requires considerable human 
effort in monitoring security incidents. This can become 
overwhelming for the CSIRT if the IT landscape is large and 
varied. Further, the incident detection methods rely upon 
singular metrics rather than a combination of multiple 
metrics. Furthermore, the standard security controls testing 
set in place via compliance standards like ISO 27001 is 
also manual in nature. Considerable research has been done in 
each of the above areas. However, existing research has not 
integrated all the developments in a single platform. The aim 
of this study is to synthesize all the existing developments in 
the field of SIEM into an integrated framework which would 
help to develop a proactive and automated future-proof SOC 
system. The method of research is literature review based. 
The study focuses on each of the core elements of the SIEM 
concept in turn and identifies the enhancements in each 
domain. It first looks at the concepts of SOC and SIEM and 
why such an application is needed. It then delves into the 
SIEM architecture and how it can be enhanced to make it 
more effective. Following this, it focuses on the inner 
working of SIEM in identifying security incidents and 
proposes improvements to the detection process. Once these 
two parts are done, it suggests ways to automate standard 
security controls testing using SIEM. In the end, the study 
aims to provide a standard set of requirements that all SIEMs 
should meet in order to develop a robust SOC. 

II. THEORETICAL FOUNDATION 

In order to understand the concepts of SOC and SIEM, it is 
essential to understand the foundation behind them. This 
section aims to provide a brief overview of both concepts and 
of the motivation behind implementing them in 
organizations. It also explains the architecture of SIEM and 
the background of how attacks are presently identified and 
evaluated in SIEM. 

A. Security Operations Center (SOC) 

A SOC is a centralized security monitoring unit in an 
organization which monitors security incidents on a real-time 
basis. It monitors the security events around the IT assets, 
including the network, firewalls, intrusion detection/prevention 
systems, application servers, database systems and user 
accounts in an organization [6]. Each of the above assets is 
monitored constantly, and the SOC receives periodic logs 
which are then analyzed for any security incidents. It also 
proactively flags malicious events on a real-time basis, which 
allows CSIRT teams to swiftly react and defend the 
infrastructure from attacks. 

The effectiveness of SOC depends heavily on its analytical 
and forensic abilities and how quickly it can analyze the data 
and report events back to end-users [6]. This requires an 
in-depth understanding of the entire IT infrastructure in order 
to perform correlation analysis. SOC is able to perform all the 
logging and monitoring due to SIEM systems which are 
integral to it. 

B. Security Information and Event Management (SIEM) 

As mentioned in the previous section, SIEM forms the inner 
core of the SOC architecture. As the name suggests, SIEM is 
a combination of Security Information Management (SIM) 
and Security Event Management (SEM). SEM aggregates the 
data of the security logs into management information. It 
then creates security incidents which are tackled by the 
CSIRT. While SEM focuses on data aggregation, SIM 
focuses on analyzing historical data and performing trend 
analysis on it. These trends help SIEM to flag events even 
before their occurrence, thereby improving the long-term 
effectiveness of information security systems [5]. 

SIEM systems help to consolidate and evaluate messages and 
incidents from individual system components in a timely 
manner. They collect logs from disparate sources and 
normalize them into a common standard representation. They 
then pass these events to their rule engine, which sends 
alerts once a rule is activated [6]. These security alerts are not 
specific to single applications; correlation analysis integrates 
them across the complete IT platform. However, the 
advances in SIEM have led to an exponential increase in the 
number of security incidents. Past experience in multiple 
organizations has shown that SIEM systems are complex to 
operate and require high resource effort to analyze all events. 
Thus, in the long term, security analysts end up neglecting 
SIEM systems on an operational level [2]. 

C. SIEM Architecture & Working 

A typical SIEM infrastructure has the below mentioned six 
core components as is described by [7]: 

a. Source Device: The source systems are the data sources 
that provide security runtime logs from the components 
within the entire enterprise infrastructure. It can be anything 
from application servers to firewalls, databases, IDS/IPS 
systems, etc. Since different systems have different syntax in 
data storage, the logs are made interoperable by SIEM. 

b. Log Collection: The logs from the data sources are 
collected by SIEM using one of two techniques, PUSH or 
PULL. The PUSH technique involves logs being proactively 
pushed by data sources into SIEM on a real-time basis, 
whereas the PULL technique involves SIEM pulling data 
from the source device on a periodic basis. The PULL 
technique is safer, as SIEM then understands what kind of 
data is collected. 

c. Normalization: The normalization engine is one of the most 
important components of SIEM. Each source device 
produces log files with its own syntax. 
In order for these logs to be analyzed in correlation to each 
other, it is important for them to be normalized into a 
standard format. Normalization ensures that the original data 
from source devices is standardized to a common format. 

d. Rule/Correlation Engine: It consists of the rule and 
correlation engines. The rule engine is a repository of all the 
rules that are required to evaluate specific security events. A 
rule engine evaluates logs in the ‘what-if’ format, which 
usually returns a Boolean value. While rule engines are the 
repositories for storing rules, correlation engines are the 
analytical backbone of SIEM. Based on the defined rules, 
the correlation engine analyzes log data to identify patterns 
of security events. Most attack types are not simple enough 
to be flagged on the basis of specific rules alone. Correlation 
engines analyze the logs in the context of the entire 
infrastructure and thereby correlate events to flag the correct 
security events. Correlation engines use artificial intelligence 
to reduce false positives, increasing the efficiency of event 
detection [7]. 

e. Data Storage: Data storage involves storage of both 
security logs along with the storage of SIEM related data. 
This data is critical in order to perform historical trend 
analysis along with maintaining the audit logs for future 
security audits [7]. 

f. Monitoring: Monitoring allows the SIEM administrators to 
interact with the application in order to access the data and 
also to independently analyze the data. This is normally a 
visual front end for presenting data in a more compact and 
comprehensible manner. 
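
The interplay of normalization, rule evaluation and correlation described above can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the two raw log formats, the field names, the rule names and the brute-force correlation pattern are all hypothetical and do not correspond to any particular SIEM product or to the components described in [7].

```python
import re
from collections import defaultdict

# Two hypothetical raw log syntaxes from different source devices.
FIREWALL_RE = re.compile(r"^(?P<ts>\d+) FW (?P<action>ALLOW|DENY) src=(?P<src>\S+)$")
AUTH_RE = re.compile(
    r"^(?P<ts>\d+) AUTH user=(?P<user>\S+) ip=(?P<src>\S+) result=(?P<result>ok|fail)$"
)


def normalize(raw_line):
    """Map a raw log line to a common event dict, or None if unrecognized."""
    m = FIREWALL_RE.match(raw_line)
    if m:
        return {"ts": int(m["ts"]), "source": "firewall", "src_ip": m["src"],
                "event": "conn_" + m["action"].lower(), "user": None}
    m = AUTH_RE.match(raw_line)
    if m:
        return {"ts": int(m["ts"]), "source": "auth", "src_ip": m["src"],
                "event": "login_" + m["result"], "user": m["user"]}
    return None


# Rule engine: each rule is a 'what-if' predicate returning a Boolean.
RULES = {
    "failed_login": lambda e: e["event"] == "login_fail",
    "denied_connection": lambda e: e["event"] == "conn_deny",
}


def correlate(events, threshold=3, window_s=60):
    """Flag a successful login preceded by >= threshold failed logins
    from the same source IP within window_s seconds."""
    fails = defaultdict(list)
    alerts = []
    for e in sorted(events, key=lambda ev: ev["ts"]):
        if RULES["failed_login"](e):
            fails[e["src_ip"]].append(e["ts"])
        elif e["event"] == "login_ok":
            recent = [t for t in fails[e["src_ip"]] if e["ts"] - t <= window_s]
            if len(recent) >= threshold:
                alerts.append({"src_ip": e["src_ip"], "user": e["user"],
                               "ts": e["ts"],
                               "reason": "repeated failures followed by success"})
    return alerts


RAW_LOGS = [
    "100 AUTH user=alice ip=10.0.0.5 result=fail",
    "110 AUTH user=alice ip=10.0.0.5 result=fail",
    "120 AUTH user=alice ip=10.0.0.5 result=fail",
    "125 FW DENY src=10.0.0.9",
    "130 AUTH user=alice ip=10.0.0.5 result=ok",
    "140 AUTH user=bob ip=10.0.0.7 result=ok",
    "unparseable line",
]

events = [e for e in map(normalize, RAW_LOGS) if e is not None]
rule_hits = {name: sum(rule(e) for e in events) for name, rule in RULES.items()}
alerts = correlate(events)
print(rule_hits)
print(alerts)
```

No single rule in this sketch identifies the suspicious login on its own; the alert arises only when several normalized events are evaluated together, which is the role the text assigns to the correlation engine.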

D. SIEM Attack modeling security evaluation 

In order to identify security incidents, SIEM solutions use 
multiple evaluation techniques to identify incidents 
accurately and in real time. They help to find and correct 
gaps in the network configuration, reveal possible attack 
actions for different security vulnerabilities, and determine 
the critical network resources, thereby supporting the choice 
of an effective security policy and mechanisms appropriate to 
current threats [8]. There are many approaches and 
algorithms for identifying threats, such as modeling the 
malefactor’s behavior, generating a common attack graph, 
calculating different security metrics and providing a risk 
analysis procedure [8]. 

III. ENHANCEMENTS TO SIEM FOR A ROBUST SOC 

Having explained all the concepts around SIEM, this section 
aims to propose the enhancements which would help to 
develop a robust future-proof SOC infrastructure. Raydel et 
al. mention the main challenges that IT security professionals 
face in the security setup of their organizations [9]. The 
critical technical challenges outlined by Raydel et al. are 
listed below [9]: 

a. Variety of source devices to secure: In a diverse IT 
landscape with multiple applications performing specialized 
tasks, it becomes challenging to ensure the security of all 
applications. It is complicated to ascertain, on the basis of the 
results in SIEM, the security changes required for different 
applications in a consistent and standard manner [9]. 

b. Quick response to new threats: CSIRT members need to 
ensure that security vulnerabilities identified within SIEM are 
plugged quickly before they are exploited by hackers. This 
can involve a variety of security measures, ranging from 
installing system patches to larger changes such as 
re-configuring the security parameters of the application. 
These are time-consuming tasks and a quick response is not 
always possible. 

c. Lack of interoperability and integration of security tools: 
There is no standard tool which addresses all the security 
requirements of an organization. Teams have to rely on 
multiple security tools, each with their own distinctive format 
and usage requirements to get all corners covered in the 
security infrastructure. 

In view of the above challenges, it is essential to create a 
single standardized solution which eliminates the challenges 
and provides a robust solution to the security needs of any 
organization. This study proposes a three-staged approach in 
enhancing the SIEM solution. The three stages are mentioned 
below: 

A. Implementation of Distributed SIEM architecture 

The conventional SIEM architecture described in the above 
section is a centralized architecture with six components. This 
architecture becomes very difficult to manage in a large 
organization. One of the main challenges in a centralized 
architecture is the problem of log maintenance. A large 
number of source devices can lead to a large volume of logs 
generated from numerous sources which are inconsistent in 
content, format, timestamp, etc. [10]. SIEM greatly reduces 
the impact of this challenge by normalizing the data. However, 
the primary problem of the large volume of logs still remains 
unsolved. In order to solve this problem, the SIEM 
architecture has to be decentralized and distributed as per the 
‘Hierarchical Managers Model’ outlined by Anastasov et al. 
in [10]. The ‘Hierarchical Managers Model’ extends the 
traditional centralized SIEM architecture by creating a 
hierarchy of SIEM servers that are connected to a central 
SIEM server. Thus, the central SIEM server acts as a parent 
node and communicates with the child SIEM servers, named 
‘Child Managers’, instead of communicating directly with the 
source devices for log data [8][10]. The entire process of 
collecting, normalizing, storing and monitoring logs is 
done at the child level, and normalized log data is sent to the 
parent node only for data aggregation, correlation and 
reporting. Fig. 1 illustrates the architecture of the 
‘Hierarchical Managers Model’. 

Fig. 1. Hierarchical Managers Architecture by [10] 


The main advantage of this architecture is that it introduces 
the advantages of distributed computing to SIEM. Data 
management is done by distributing the load across multiple 
correlation/rule engines, thereby reducing the effort at the 
central node. Only the data needed for aggregation and 
correlation, which is a subset of the data at the child manager 
level, is sent to the parent node for analysis. This reduces the 
load on the central node and increases its throughput, leading 
to quicker computation times. Along with the SIEMs, the 
SOCs too need to be distributed at regional level, and only the 
data required for correlation analysis needs to be sent to the 
parent SOC [10]. This also eases the installation and 
deployment of SIEM systems. 
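
The division of work between child managers and the parent node can be illustrated with a minimal Python sketch. The class names, event fields and the cross-site correlation criterion below are hypothetical choices made for illustration; they are not taken from the model in [10].

```python
from dataclasses import dataclass, field


@dataclass
class ChildManager:
    """Collects, normalizes and stores logs for one region or site."""
    name: str
    stored: list = field(default_factory=list)

    def ingest(self, raw_events):
        # Normalize each (source IP, event type) pair and store it locally.
        for src_ip, kind in raw_events:
            self.stored.append(
                {"manager": self.name, "src_ip": src_ip, "event": kind.lower()}
            )

    def forward(self, is_relevant):
        # Only the subset needed for central correlation leaves the child.
        return [e for e in self.stored if is_relevant(e)]


@dataclass
class ParentManager:
    """Central node that correlates the subsets sent by child managers."""
    children: list

    def correlate(self, is_relevant, min_sites=2):
        seen = {}
        received = 0
        for child in self.children:
            for e in child.forward(is_relevant):
                received += 1
                seen.setdefault(e["src_ip"], set()).add(e["manager"])
        # Flag source IPs reported by at least `min_sites` child managers.
        flagged = sorted(ip for ip, sites in seen.items() if len(sites) >= min_sites)
        return flagged, received


east = ChildManager("east")
west = ChildManager("west")
east.ingest([("10.0.0.5", "LOGIN_FAIL"), ("10.0.0.6", "LOGIN_OK"),
             ("10.0.0.5", "LOGIN_FAIL")])
west.ingest([("10.0.0.5", "LOGIN_FAIL"), ("10.0.0.8", "LOGIN_OK")])

parent = ParentManager([east, west])
flagged, received = parent.correlate(lambda e: e["event"] == "login_fail")
total_stored = len(east.stored) + len(west.stored)
print(flagged, received, total_stored)
```

In this toy run the parent node receives fewer events than the child managers store, which mirrors the load reduction at the central node described above.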

B. Common Framework for Attack Modeling and Security 
Evaluation in SIEM 

One of the challenges outlined by Raydel et al. in [9] is the 
need for a quicker and more accurate response to new 
threats. The key to a quicker response to new security threats 
is to pinpoint the threat accurately and as fast as possible. It is 
in this context that the work of Kotenko et al. finds 
significance [8]. Conventional security evaluation algorithms 
rely on one technique for identifying security threats. 
Kotenko et al. proposed the Attack Modeling and Security 
Evaluation Component (AMSEC) architecture, which uses 
multiple algorithms in parallel to achieve accurate, near 
real-time identification of security threats [11]. The 
techniques proposed by Kotenko et al. as part of the AMSEC 
architecture for achieving this are mentioned below [8]: 

d. Usage of security repository and open security databases 
containing system and network vulnerabilities, attacks, 
configuration, weaknesses, countermeasures, etc. 

e. Generation of attack trees considering service dependency 
graphs and zero-day vulnerability reports based on 
Topological Vulnerability Analysis (TVA). In TVA, the 
graph generator computes the attack scenarios possible due to 
the vulnerabilities identified in the system. It would be based 
on both forward and backward analysis in order to cover all 
combinations of attack sequences. This would help to model 
critical attack scenarios whose events, when they occur in 
sequence, should be flagged as a possible attack. 

f. Application of anytime algorithms to provide near to 
real-time attack modeling. This would make the system 
effective to detect vulnerabilities at run-time. 

g. Usage of the generated attack graphs to predict possible 
malefactor’s actions: it does this by first creating the attack 
graphs for the profile of the malefactor selected by the user. 
Following this, it would predict the future actions of the 
malefactor based on its actions. 

h. Calculation of a multitude of security metrics, attack and 
response impacts: Based on the skill level of the malefactor, 
the system would calculate the various metrics for the impact 
of possible attack along with the impact of the possible 
counter-response. The level of counter-response depends on 
the skill level of the malefactor. 

i. Interactive decision support to select the security solutions: 
In the final step, the AMSEC framework deploys a decision 
support center which incorporates data from all the above 
metrics and creates a decision support model which would 
assist users in taking the appropriate counter-measures based 
on the severity of the attack. 
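
To make the idea of attack graph generation more concrete, the following Python sketch enumerates attack sequences by a simple forward analysis over exploit preconditions. It is a deliberately small toy: the exploit names, privilege labels and the exhaustive depth-first search are assumptions made for illustration, and it is neither the TVA graph generator nor an anytime algorithm as used in AMSEC.

```python
def attack_paths(exploits, initial, target):
    """Enumerate exploit sequences leading from `initial` privileges to `target`.

    exploits maps a name to (required_privileges: frozenset, gained_privilege: str).
    """
    paths = []

    def extend(privs, path):
        if target in privs:
            paths.append(list(path))
            return
        for name, (required, gained) in exploits.items():
            # Applicable if the preconditions hold and something new is gained
            # (the second condition also prevents cycles).
            if required <= privs and gained not in privs:
                path.append(name)
                extend(privs | {gained}, path)
                path.pop()

    extend(frozenset(initial), [])
    return paths


def minimal_paths(paths):
    """Keep only paths whose exploit set has no strict subset among the paths."""
    sets = [frozenset(p) for p in paths]
    kept, seen = [], set()
    for path, s in zip(paths, sets):
        if s in seen or any(other < s for other in sets):
            continue
        seen.add(s)
        kept.append(path)
    return kept


# Hypothetical vulnerabilities of a small network (illustrative only).
EXPLOITS = {
    "phish_user": (frozenset({"internet"}), "user@workstation"),
    "exploit_web": (frozenset({"internet"}), "user@webserver"),
    "priv_esc_web": (frozenset({"user@webserver"}), "root@webserver"),
    "pivot_db": (frozenset({"root@webserver"}), "user@db"),
    "stolen_creds_db": (frozenset({"user@workstation"}), "user@db"),
}

all_paths = attack_paths(EXPLOITS, {"internet"}, "user@db")
critical = minimal_paths(all_paths)
for p in critical:
    print(" -> ".join(p))
```

Each minimal path is a sequence of events that a correlation engine could watch for in order; the exhaustive search used here grows exponentially and would not scale to a real network.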

The AMSEC framework provides a complete, well-rounded 
platform for computing security incidents. It creates attack 
graphs which help to compute all the possible attack 
scenarios, and these in turn help in predicting attacks and 
taking the necessary countermeasures to preempt them. It 
would allow for accurate and faster evaluation of system and 
network security. AMSEC can be integrated into the rule and 
correlation engines to perform effectively. Coupled with this, 
the distributed nature of the SIEM would make computations 
within AMSEC faster, more accurate and more manageable. 

C. Automate Security Controls testing using SIEM 

Until now, all the steps mentioned involved proactive 
involvement of the CSIRT members in the effective working 
of the SOC. All the above measures, coupled with the large 
and varied IT landscape, would make the SOC implementation 
a very complex and resource-intensive system. To make it 
financially viable and less resource intensive, the SOC system 
needs to be made effective by reducing the complexity of the 
overall architecture [9]. This can be done by automating the 
security controls in the framework, as proposed by Raydel et 
al. in [9]. Security automation as defined by Raydel et al. 
involves “the automatic operation and monitoring of security 
controls by existing hard- and software security tools, 
reducing human intervention to a minimum” [9]. According 
to them, for a security control to be automated, it needs to be 
completely in machine-readable format with no requirement 
for human intervention in decision making. For example, 
security training cannot be automated as it involves the 
human component. Furthermore, for a system to be 
automated, all the security tools must be managed via a 
centralized architecture. All these factors are inherent 
characteristics of the SIEM architecture, which thus makes it 
a prime platform for automating security controls. The 
security controls are derived from the standard security 
compliance frameworks like ISO 27001, compliance Audit 
Guidelines, ISAE SoX standards, etc. 

To illustrate the nature of an automated control, consider 
control A.10.5.1 of ISO/IEC 27001, which looks into 
information backup. Automating this control via SIEM would 
mean that all the logging and monitoring of backup logs 
would be automated. Furthermore, in the event of a backup 
failure, the system would reschedule the backup without 
human intervention [9]. Raydel et al. have grouped the 
security controls from the ISO 27001 framework that can be 
automated; these are enumerated below [9]: 

a. Asset inventory (hardware and software): This control 
involves maintaining the inventory of all the network 
components of the organization. SIEM would help to track the 
inventory, its patch history, version history and the software 
installed within it. It would perform automated patch 
installations, and any deviations from the norm would be 
analyzed, prioritized and then reported to the CSIRT. 

b. Account management: This control requires the presence 
of an Identity and Access Management (IAM) system which 
creates, modifies, deletes and performs recertification of user 
and technical accounts on a periodic basis. SIEM can be 
integrated into the IAM system to automate the monitoring of 
user account activities along with the maintenance of the 
Segregation of Duties (SOD) matrix and the automated 
deletion of disabled accounts. 

c. Log management: Audit logs record events like network 
activities, security exceptions, user activities, exceptions, and 
other events. These logs need to be maintained for forensic 
analysis and audit reasons. SIEM can automatically collect, 
aggregate, analyze, correlate and provide proactive security 
alerts in case of any deviation. 

d. System monitoring: This involves proactive monitoring of 
all information security events and detection of system 
attacks. This is the primary task of SIEM, as it supports near 
real-time analysis of events and also correlates data from 
multiple source devices. 

e. Malware protection: Organizations need to have malware 
detection systems at the critical entry and exit points of the 
infrastructure. These should check all systems daily to detect 
malicious code signatures and alert the users. SIEM supports 
malware detection programs and can help in detecting zero-day 
attacks, backdoors, worms, Trojans, etc. via behavior analysis. 

f. Vulnerability scanning and patch management: 
Organizations must scan their network components for 
vulnerabilities and must apply security patches when one is 
detected. SIEM can take this a level further and automate the 
complete process. It can also perform correlation analysis 
based on the vulnerabilities identified, develop attack 
scenarios that would exploit them, and thereby preemptively 
stop an attack in its early stages. 

g. Security assessment and compliance checking: 
Organizations need to periodically assess their infrastructure 
against compliance standards and industry best practices in 
order to keep their security infrastructure up to date. This is 
done by implementing a configuration monitoring system which 
performs remote testing of secure configuration elements. 
Integrating SIEM with these scanners would lead to centralized 
analysis of the system reports and to dynamic alerting on any 
event or incident that would cause non-compliance. SIEM can 
generate detailed dashboards with evaluation scorecards for 
tracking these checks. 

h. Information backup: As mentioned previously, SIEM can 
automate the backup process for all workstations in the 
organization and also take required steps to handle failed 
backups. 

i. Physical security: Physical security here means restricting 
employees to only those areas of the company to which they 
need access. Critical environments such as datacenters, 
development centers, etc. need to be off limits to employees. 
SIEM can integrate with physical security devices to perform 
security event analysis. It can alert the CSIRT team in case of 
a security breach and also identify the target of the breach. 
This would help in stopping unauthorized malicious access in 
its early stages. 

j. Incident management: Organizations should implement 
incident management systems that effectively track the 
creation of incident tickets for detecting, analyzing, 
containing the impact of, eradicating and recovering the system 
from a security breach. Integrating SIEM with incident 
management would lead to incident tickets being created 
directly once a pattern of an attack emerges, with the CSIRT 
personnel notified before the attack reaches full maturity. 
This helps to preemptively stop an attack and take steps to 
reduce its impact. 
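
As an illustration of the information backup control described above (monitoring backup logs and rescheduling failed backups without human intervention), the following is a minimal Python sketch. It is not taken from [9] or from any SIEM product: the `BackupEvent` record, the `plan_retries` function and the one-hour retry delay are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupEvent:
    """One parsed backup log entry (hypothetical record layout)."""
    host: str
    finished_at: datetime
    succeeded: bool


def plan_retries(events, retry_delay=timedelta(hours=1)):
    """Return {host: retry_time} for hosts whose most recent backup failed."""
    latest = {}
    for ev in events:
        prev = latest.get(ev.host)
        # Keep only the most recent event per host.
        if prev is None or ev.finished_at > prev.finished_at:
            latest[ev.host] = ev
    return {
        host: ev.finished_at + retry_delay
        for host, ev in latest.items()
        if not ev.succeeded
    }


events = [
    BackupEvent("ws-01", datetime(2016, 8, 1, 2, 0), True),
    BackupEvent("ws-02", datetime(2016, 8, 1, 2, 5), False),
    BackupEvent("ws-03", datetime(2016, 8, 1, 2, 10), False),
    BackupEvent("ws-03", datetime(2016, 8, 1, 3, 10), True),  # later retry succeeded
]
retries = plan_retries(events)
```

Only `ws-02` is scheduled for a retry here, because the later successful run on `ws-03` supersedes its earlier failure. A real deployment would hand the retry times to whatever backup scheduler the organization uses.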

Fig. 2 illustrates the implementation of the above-mentioned 
security automation architecture using SIEM, as proposed by 
Raydel et al. in [9]. Currently, all the mentioned security 
controls are managed in separate security applications. SIEM 
would integrate all these systems in a centralized place and 
serve as a one-stop source for all compliance activities. It 
would also automate the controls to a greater extent, thereby 
reducing the complexity of the entire architecture. It can also 
perform correlation analysis across security controls, so that 
attack scenarios can be computed from different security issues 
which would otherwise seem disparate and unconnected. This 
would make SIEM an information security hub that not only 
automates controls but also centralizes all security control 
activities. 
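
The idea of correlating events across security controls can be sketched as follows. This is a deliberately simple illustration and not the method of [9]: it assumes events have already been normalized to `(timestamp_seconds, host, control)` tuples, and the `correlate` function, the 600-second window and the threshold of two distinct controls are hypothetical choices made for the example.

```python
from collections import defaultdict


def correlate(events, window_seconds=600, min_controls=2):
    """Flag hosts that raise events from several controls within one window.

    `events` is an iterable of (timestamp_seconds, host, control) tuples.
    Returns {host: sorted list of controls seen together}.
    """
    by_host = defaultdict(list)
    for ts, host, control in events:
        by_host[host].append((ts, control))

    flagged = {}
    for host, items in by_host.items():
        items.sort()  # order by timestamp
        for i, (start, _) in enumerate(items):
            # Controls that fired within the window starting at this event.
            controls = {c for ts, c in items[i:] if ts - start <= window_seconds}
            if len(controls) >= min_controls:
                flagged[host] = sorted(controls)
                break
    return flagged


events = [
    (0, "srv-a", "malware"),
    (120, "srv-a", "account"),
    (300, "srv-b", "vulnerability"),
    (5000, "srv-b", "account"),
    (400, "srv-a", "account"),
]
suspicious = correlate(events)
```

Here `srv-a` is flagged because a malware event and an account event occur within ten minutes of each other, while the two events on `srv-b` are too far apart to be linked.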



Fig. 2. Security Automation architecture using SIEM by 
[9] 


IV. CONCLUSION 

This study looks at the concept of SIEM as implemented in 
organizations and proposes a framework for enhancing it in 
order to build the SOC of the future. Security infrastructure 
has to move beyond logging activities and look at information 
retrieval and processing from logs. It needs to evolve from a 
log-centric approach to an information security data-driven 
approach. The study proposes a triad of enhancements to the 
existing SIEM setup. Firstly, the SIEM architecture needs to 
move from a centralized processing environment to a 
distributed computing environment for effective and faster 
processing of security events. Secondly, the study proposes 
revamping security metrics calculation and attack graph 
creation from a single-algorithm approach to a more integrated 
approach by implementing multiple security calculation 
techniques in a single architecture. This can be done more 
effectively using the distributed architecture proposed for 
SIEM. Finally, once the SIEM architecture and its internal 
workings are optimized, the study proposes building SIEM into a 
strategic centralized security monitoring and response 
application by automating most of the security controls 
defined within standard compliance frameworks like ISO 27001. 

All these enhancements would help position SIEM as an 
information processing and security intelligence system 
rather than a log collection application. There is great scope 
for SIEM to develop further, thereby creating a robust, less 
complex and almost fully automated SOC system. 


Abstract — It is a well-known fact that privacy has become one 
of the major concerns in photo sharing: photographs that we 
might not want somebody else to see can, because of wrong 
privacy settings, end up in other people's timelines. To avoid 
the problem of unnecessarily making every photograph public, or 
of publishing each photograph with a hand-set privacy policy, 
we intend to develop a project that provides content-based 
privacy for images. Content-based privacy means that images are 
of different types (for example, images containing kids), and 
these types can be identified through an image content matching 
algorithm. Every image is composed of a certain combination of 
pixels, which together represent certain textures, colors and 
shapes. By properly identifying the texture, color and shape of 
an image, we can find out the category of the image. From the 
moment we start using a site, our behavior is remembered by the 
system, that is, which types of photo we share with friends, 
which with family, and which with everyone. 

As the user publishes a new photograph, it is matched against 
his or her previous images based on both the metadata (text 
information) and the image content (color, texture and shape). 
For example, given a new image, the system extracts its 
features, matches them with those of the previous images, and 
automatically predicts the policy. Thus, even if you forget to 
change the privacy setting of an image, the system should be 
able to tag the photograph, that is, to change its privacy 
setting so that only family members or the other groups we 
choose are able to see it, without our setting this each time. 

Index Terms — Content Based Privacy, Metadata, Content 
matching algorithm. 


I. Introduction 

Nowadays we share a lot of photos, especially on social 
networking sites such as Facebook and Flickr. Privacy is one of 
the major concerns with our photos. There are hobby photos that 
we want everybody to see: for instance, when we go out 
somewhere and see some scenery, we post it on Facebook, Twitter 
or Flickr, and we want all our friends to see it. Whenever 
there is a newborn baby in the family, we generally circulate 
that photo through WhatsApp so that our family members are able 
to see it; such a facility is available on almost all social 
networking sites. So, for example, we want only our family 
members to see the kids' photographs. Similarly, there are 
photographs that we want only our college group to see, for 
example a group photo taken in the classroom, or photographs of 
certain notes shared within a study group. We want only certain 
groups to see those photographs. Among the groups we might also 
have various interest groups, for example an archaeological 
science group that shares and uses photographs of various 
archaeological subjects. Whenever we post a photograph, it gets 
the default privacy setting. The user has to select different 
privacy settings for different photos, and more often than not 
we forget to do so because it is quite a tedious process: we 
need to select a group, tag a particular group, and so on. 
Content-based privacy is used in this project to address this. 

We can find out the category of an image from the text that 
we post with it. For example, somebody posting a photo of his 
newborn child will invariably write something like "our baby, 
born on such and such a date" or "we are lucky enough to have a 
baby". So we can immediately understand that the photo belongs 
to the category "baby", that is, its content is that person's 
baby. At the beginning, the person shares his photographs with 
explicit privacy settings: for example, the baby's photos for 
family members only, classmates' photographs for classmates 
only, sports photographs for the sports group only, and 
photographs of various buildings for the archaeological survey 
group. Once the user has started sharing with privacy settings 
in this way, the project remembers those settings and examines 
both the textual content, that is, the text entered while 
publishing the photograph, and the image content, that is, the 
pixel color values, the texture values, the shape values and so 
on. The text that we associate with an image is known as 
metadata. Metadata is not the data itself (the data here is the 
image); it is the description of the image. So, from the moment 
we start using a site, the system remembers our behavior, that 
is, which types of photo we share with friends, which with 
family and which with everyone, and every new image is matched 
against this stored image content and metadata. For example, if 
you upload a new photograph of a baby, the system should be 
able to tell automatically from your past data that this is a 
photograph of a baby. If you forget to choose a setting for an 
image, the system predicts the policy automatically, so there 
is no need to change the privacy setting each time, and a large 
number of images can be uploaded at once. This is the overall 
project. 


II. Literature Survey: 

Because of the widespread use of social sites, a huge amount of 
data is being shared on them, which can violate users' privacy. 
In [1], a survey is carried out and, to protect privacy, a 
semantically annotated hidden Markov model is used to measure 
the similarity of annotated photos in the database. To maintain 
privacy protection in a community, images need to be protected 
through different settings; in [2], a protection of innovation 
prompt is used when the user shares data, which helps fulfil 
the user's end goal. Users are not satisfied when personal data 
leaks among friends or within a group; to handle this problem, 
a survey was conducted that reviews different privacy settings 
against the users' level of satisfaction [3]. To secure images 
and shared data, the system in [4] automatically annotates the 
images to be published using a hidden Markov model and extracts 
their features. Uploading photos to content-sharing sites may 
lead to privacy violations; [5] addresses this with a 
survey-based review aimed mainly at better securing personal 
information. In [6], a survey of security and private image 
sharing is presented, and a new privacy-preserving method for 
tagging images on social networking sites is outlined. Finally, 
[7] examines how similar policies are obtained by automatically 
generating a policy for each uploaded photo, how access to 
shared data is restricted, and how this affects the 
effectiveness of the tagging system. 


III. System Design 


A3P Architecture 



The user inputs an image, assuming the user has earlier 
uploaded some images to Flickr or a similar site. The features 
of the current image, that is, its color features, are 
extracted and compared with those of the previous images to 
obtain their metadata. Metadata here means the tags and the 
security settings of the previous images. The policy is derived 
by comparison with the image policies already stored for the 
user. Once this is done, the image is published with the 
predicted policy. 

Tools and Technologies used 

Flickr API: Flickr is a social site where we can upload, share, 
tag and view images. A large amount of data can be uploaded, 
photos can be shared with friends or with everyone, and 
settings are available for forming groups. 

AForge 

AForge is a real-time computer vision library for .NET. Its 
statistics class returns three statistics, red, green and blue, 
because every image pixel is composed of red, green and blue 
components. These three statistics are added to a chart as 
three series, one each for the red, green and blue values, and 
displayed. 
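
To make the per-channel statistics concrete, here is a minimal sketch in plain Python. It does not call AForge or any imaging library; it assumes the image has already been decoded into rows of `(r, g, b)` tuples, and the function name `rgb_means` is hypothetical. It computes only the mean of each channel, which is one simple choice of color feature and not necessarily the exact statistic the project uses.

```python
def rgb_means(pixels):
    """Return the mean red, green and blue values of an image.

    `pixels` is a list of rows; each row is a list of (r, g, b) tuples
    with channel values in the range 0..255.
    """
    totals = [0, 0, 0]
    count = 0
    for row in pixels:
        for r, g, b in row:
            totals[0] += r
            totals[1] += g
            totals[2] += b
            count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(t / count for t in totals)


# A tiny 2x2 test image: two red pixels, one blue, one green.
image = [
    [(255, 0, 0), (255, 0, 0)],
    [(0, 0, 255), (0, 255, 0)],
]
features = rgb_means(image)
```

The resulting three-element tuple can serve as a small color feature vector for comparing images.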

Machine Learning: A form of learning, similar to data mining, 
in which patterns are learned from data automatically. 

K-Nearest neighbor classifier: As its name indicates, it looks 
for the nearest values, that is, a new sample is classified 
according to its nearest neighbors only. It is a machine 
learning algorithm. 

So how do we find the nearest value? What we need to do is 
find the smallest number. 

First we take 

small = inf 

Suppose we have the values 12, 1, 7, 19. 

1. Compare whether 12 is smaller than infinity. It is, so small 
becomes 12 and the index is 0. 

small = 12, index = 0 
12 1 7 19 

2. Compare whether 1 is smaller than 12. It is, so small 
becomes 1 and the index becomes 1. 

small = 1, index = 1 
12 1 7 19 

3. Compare whether 7 and 19 are smaller than 1: 
7 < 1, no, and 19 < 1, no. 

So at the end of the loop we know which index holds the 
smallest value. 
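
The loop walked through above can be written as the short Python sketch below, followed by one way it might be used for nearest-neighbour classification. The function names and the sample feature vectors and labels are hypothetical. Euclidean distance via `math.dist` (Python 3.8 or later) is an assumption, since the text does not say which distance measure is used.

```python
import math


def index_of_smallest(values):
    """Return (index, value) of the smallest element, scanning left to right."""
    small = math.inf
    index = -1
    for i, v in enumerate(values):
        if v < small:  # found a new smallest value
            small, index = v, i
    return index, small


def nearest_label(query, samples):
    """Return the label of the stored sample closest to `query`.

    `samples` is a non-empty list of (feature_vector, label) pairs.
    """
    if not samples:
        raise ValueError("no stored samples to compare against")
    distances = [math.dist(query, feats) for feats, _ in samples]
    idx, _ = index_of_smallest(distances)
    return samples[idx][1]


result = index_of_smallest([12, 1, 7, 19])

# Hypothetical mean-RGB feature vectors with the privacy label used before.
samples = [
    ((200.0, 150.0, 140.0), "family"),
    ((60.0, 120.0, 200.0), "public"),
]
label = nearest_label((190.0, 155.0, 150.0), samples)
```

For the values 12, 1, 7, 19 the loop ends with index 1 and value 1, matching the walkthrough. This is the single-nearest-neighbour case (k = 1); a k-NN classifier with larger k would take a vote among the k smallest distances instead.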


IV. Methodology: 

The user should create a Flickr account and create the 
appropriate groups. He must invite friends to the different 
groups, and the friends must join them; then he needs to run 
the application. In the application, the user browses for an 
image of a specific category, enters all the necessary 
metadata, chooses the privacy setting and uploads the image. 
When the user uploads an image, its features (texture, color 
and shape) and its metadata are extracted and saved in a 
database. Every time the user uploads a new image, it is 
compared with the previously uploaded images. If any previously 
uploaded image is close to the new image, the privacy setting 
is automatically set to the one used for that previously 
uploaded image. If the image is completely new and has no 
relationship with any previously uploaded image, the user is 
prompted for new privacy settings, which are also saved in the 
database. Once the user has shared the image, it should be 
available across the internet on the same site, so that other 
users can view it according to its privacy setting. For 
example, we log in to the system with the account of one of our 
friends who has already accepted our group request in Flickr 
and is part of the news group; we need to show that this user 
is able to see all the news-related photographs and not the 
others. Similarly, for friends who are part of the sports 
group, we log in to Flickr through their account and show that 
they are able to see only the sports-related photographs shared 
by us. 
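
The upload workflow described above (reuse the setting of a close previous image, otherwise prompt the user and store the answer) might be sketched as follows. This is an illustrative sketch only: the `PolicyStore` class, the `ask_user` callback, the Euclidean distance and the numeric threshold that decides what counts as "close" are all assumptions made for the example, and no Flickr API call is shown.

```python
import math


class PolicyStore:
    """In-memory stand-in for the database of (features, policy) records."""

    def __init__(self, threshold):
        self.threshold = threshold  # maximum distance that counts as "close"
        self.records = []

    def predict(self, features):
        """Return the policy of the closest stored image, or None if none is close."""
        best_policy, best_dist = None, math.inf
        for stored, policy in self.records:
            d = math.dist(features, stored)
            if d < best_dist:
                best_policy, best_dist = policy, d
        return best_policy if best_dist <= self.threshold else None

    def upload(self, features, ask_user):
        """Choose a policy for a new image and remember it for later uploads."""
        policy = self.predict(features)
        if policy is None:
            policy = ask_user()  # completely new kind of image: prompt the user
        self.records.append((features, policy))
        return policy


store = PolicyStore(threshold=30.0)
first = store.upload((200.0, 150.0, 140.0), ask_user=lambda: "family")
second = store.upload((195.0, 152.0, 138.0), ask_user=lambda: "public")
third = store.upload((20.0, 30.0, 220.0), ask_user=lambda: "sports group")
```

The first upload has nothing to compare against, so the user is asked. The second is close to the first and inherits its setting without a prompt. The third is far from both and triggers a new prompt.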


V. Result Analysis 


Image Table 

The accuracy is 42/50, which is 84%. 


VI. Conclusion and Future Scope 

With the popularity of social networking sites, millions of 
photographs are shared on them every day. This increases the 
risk of misuse of the shared photographs, since users often 
forget to set appropriate security and privacy settings for the 
images they share. In this work we have proposed a novel 
mechanism that guides the user by automating the process of 
choosing privacy settings for images. The proposed technique 
first learns the pattern from images already shared by the user 
and then classifies any new image that the user intends to 
share on Flickr as either private or public. The results show 
that the proposed system can predict the privacy setting of 
images with an accuracy of over 80%. It could therefore be used 
in a large variety of applications and domains, including 
Facebook, Twitter, Google Plus and so on. 

This work can be further improved by replacing the K-Nearest 
neighbor classifier, which is a primitive classifier, with more 
advanced classifiers such as neural networks. Furthermore, 
additional security settings, such as sharable within the 
group, sharable within the family, and others, could be 
incorporated as future work to extend the range of privacy 
settings for images. 